Process for partitioning of molecules
Field of the invention 

The present invention relates to isolation and purification of proteins in aqueous
two-phase systems (ATPS). Specifically, the invention provides processes for partitioning
molecules of interest in ATPS by fusing said molecules to targeting proteins which are
able to carry said molecules into one of the phases.

Background of the invention 

Liquid-liquid extraction in an aqueous two-phase system (ATPS) offers a powerful
technique for isolation and purification of proteins. The separation of macromolecules
and particles by means of liquid-liquid extraction is well known (Albertsson, 1986;
Walter et al., 1985; Kula, 1990). Mainly polyethylene glycol (PEG)-salt, PEG-dextran
and PEG-starch systems have been in use. More recently, detergents and detergents with
reversed solubility were found to be suitable for the separation of macromolecules,
and especially for the separation of proteins.

An advantage of aqueous two-phase systems (ATPS) is that they are especially suited
for large scale processing of microbial proteins not only from culture supernatants but
also from crude extracts containing cells and cell debris (Kula, 1979; Kula, 1985).
Characteristic features of biological fluids as well as suspensions are rather small
particle sizes, low density differences between fluid and suspended solids, high
viscosities of the extracts and high compressibility of the solids (Hustedt et al., 1985;
Bender and Koglin, 1986). These attributes decrease the performance of conventional
methods for solid-liquid separation, such as centrifugation and filtration, at the
beginning of a protein recovery process. Using an aqueous two-phase system, removal of
solids can be integrated into a liquid-liquid separation step; clarification is thus
combined with an initial purification (Kula, 1979; Kula, 1985).

After the extraction process, phase separation can be accomplished by settling under
gravity as well as by centrifugation (Kula, 1985). ATPS can be applied at various scales,
from very small laboratory scale to large industrial scale, and thus suits various
proteins, purposes and needs. For industrial purposes, commercially available
centrifugal separators can be used to shorten separation time. Several authors have
investigated the potential of centrifugal separators of various designs for processing
large volumes of aqueous two-phase systems (Kula, 1979; Kula et al., 1981; Kula et al.,
1982; Kula, 1985). In these studies the authors used polymer/polymer or
polymer/salt systems, and the results demonstrate the feasibility
of continuous separation of aqueous two-phase systems in centrifugal separators.

Extraction systems based on nonionic surfactants have been described as an alternative
to standard polymer/polymer or polymer/salt systems. Phase forming surfactants are e.g.
polyoxyethylene type nonionic detergents. The basis of this type of aqueous two-phase
system is the temperature-dependent reversible hydration of the polar ethylene oxide
head groups. The temperature at which the phase separation occurs is referred to as the
cloud-point (cloud-point extraction). This kind of aqueous two-phase system is
especially suited for the extraction of amphiphilic biomolecules. The potential of this
type of two-phase system for separating membrane bound proteins from cytosolic and
peripheral membrane proteins was first demonstrated by Bordier (1981). Heusch and
Kopp (1988) demonstrated that lamellar structures formed in the
miscibility gaps of polyglycol ether/water systems are responsible for the selective
extraction of hydrophobic substances.

Recently, the successful application of a surfactant-based aqueous two-phase system
for the extraction of a membrane bound protein (cholesterol oxidase) from the
unclarified culture medium of the gram-positive microorganism Nocardia rhodochrous
on a bench scale has been reported (Minuth et al., 1995). By addition of only one
chemical compound, product release through solubilization was possible in a homogeneous
phase, and in a second step clarification as well as an initial purification was
achieved by an extraction process at elevated temperatures separating the detergent rich
phase. A closed concept was further developed for the production of the membrane
bound enzyme by surfactant-based extraction, organic solvent extraction and
anion-exchange chromatography, which gave a product suitable for analytical applications
(Minuth et al., 1996).



In aqueous two-phase systems the desired target, e.g. a protein, should partition
selectively into one phase (preferentially the lighter phase) while the other substances
should partition into the other phase (preferentially the heavier phase). In PEG/salt,
PEG/dextran and similar systems there are several driving forces for a substance, such as
charge, hydrophobic and hydrophilic forces, or dependence on conformation or ligand
interaction (Albertsson, 1986). The forces leading to separation in detergent based
aqueous two-phase systems are suggested to be primarily hydrophobic (Terstappen et
al., 1993). Even though a lot of work has been carried out on prediction in ATPS,
none of the designed models provides a physical picture of the phase behaviour, and
prediction is hardly possible (Johansson et al., 1998).





In ATPS the partition coefficient K is defined as the concentration (activity in the case
of an enzyme) of the target in the top phase divided by the concentration (enzyme: activity)
of the target in the bottom phase:

$$K = \frac{C_T}{C_B}$$

The yield Y is defined as the amount of target in the top phase divided by the sum of the
amounts of target in the top and bottom phases. With V_T and V_B denoting the volumes of
the top and bottom phases, this leads to the following equation:

$$Y_T = \frac{C_T V_T}{C_T V_T + C_B V_B} = \frac{1}{1 + \frac{V_B}{V_T} \cdot \frac{1}{K}}$$

If the desired substance is directed to the heavier phase (as can be the case using
Triton), the yield is defined by:

$$Y_B = \frac{C_B V_B}{C_T V_T + C_B V_B} = \frac{1}{1 + \frac{V_T}{V_B} \cdot K}$$

The volume ratio R of the two coexisting phases is defined as the volume of the lighter
phase over the volume of the heavier phase:

$$R = \frac{V_T}{V_B}$$
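
The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the yield expressions follow from the stated definitions by a simple mass balance (amount = concentration x volume), and the numbers in the usage lines are made up rather than taken from the Examples.

```python
def partition_coefficient(c_top: float, c_bottom: float) -> float:
    """K: concentration (or enzyme activity) in the top phase over the bottom phase."""
    if c_bottom <= 0:
        raise ValueError("bottom-phase concentration must be positive")
    return c_top / c_bottom


def volume_ratio(v_top: float, v_bottom: float) -> float:
    """R: volume of the lighter (top) phase over the heavier (bottom) phase."""
    if v_bottom <= 0:
        raise ValueError("bottom-phase volume must be positive")
    return v_top / v_bottom


def yield_top(k: float, r: float) -> float:
    """Fraction of the target found in the top phase: 1 / (1 + 1 / (R * K))."""
    return 1.0 / (1.0 + 1.0 / (r * k))


def yield_bottom(k: float, r: float) -> float:
    """Fraction of the target found in the bottom phase: 1 / (1 + R * K)."""
    return 1.0 / (1.0 + r * k)


# Illustrative (hypothetical) values: a target ten times more concentrated in
# the top phase, with equal phase volumes.
k = partition_coefficient(c_top=5.0, c_bottom=0.5)
r = volume_ratio(v_top=1.0, v_bottom=1.0)
print(f"K = {k:.1f}, R = {r:.1f}, Y_top = {yield_top(k, r):.3f}")
```

Because the two yields are fractions of the same total amount, they always sum to one, which gives a quick consistency check on measured values.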
Examples of useful proteins that are difficult to purify cost-effectively
are the commonly used industrial enzymes used as biocatalysts: the glycosyl hydrolases,
proteases and lipases produced by fungi and bacteria. These are used in e.g. the laundry,
textile, paper and pulp, food and feed industries. The fact that microbes produce many
different enzymes during their growth, and the fact that some of these may be undesired
in certain applications, leads to a need to enrich the active component(s). This
enrichment can be performed by choosing appropriate growth conditions, by genetic
engineering and/or by down-stream processing (e.g. purification of the active
component(s)).

Purification of proteins is generally performed by chromatography. Usually gel
chromatographic methods are used, based on ion-exchange, hydrophobic interaction,
affinity chromatography and molecular sieving. Methods like electrophoresis and
crystallisation can also be used. These methods are well known in the art and suitable
for proteins of fairly high market value. In the case of bulk enzyme production these
methods are, however, too expensive to keep the final product at a competitive
price level. Due to the similar properties of these enzymes, several purification steps
are usually needed to separate the proteins from each other. This often causes low final
yields and therefore a high loss of product.

Many extracellular hydrolases produced by the filamentous fungus Trichoderma are
currently used in different industrial applications on a large scale. These hydrolases
are e.g. hemicellulases (such as xylanases and mannanases), cellulases (such as
endoglucanases and cellobiohydrolases) and proteases. Purification of these is well known
in the art (Bhikhabhai et al., 1984; Pere et al., 1995), but for large industrial
applications the purification methods are too expensive. Alternative methods to enrich
these hydrolases have been used, including deletion of undesired genes by genetic
engineering (Suominen et al., 1992). However, even after extensive genetic engineering
some minor undesired activities may still be present in the final product.

ATPS have been studied in purification of cellulases of T. reesei, and the purification of
an endoglucanase III showed some promising results, enriching the yield of the protein
in the upper phase (U.S. Pat. 5,139,943). ATPS have also been studied in purification
of lipases, endoxylanase and natamycin (EP 0 574 050 A1). No K and Y values are,
however, mentioned.



As in other protein purification methods, similar properties of the proteins produced by
an organism also impair separation in ATPS, so that, for example, selective separation of
one protein is not achieved optimally. To obtain selectivity in purification, affinity
chromatographic methods are used, especially for analytical purposes and in purification
of high-value products. These include immunoaffinity chromatography and various fusion
protein strategies well described in the art, such as fusing the protein of interest to
another protein (e.g. glutathione-S-transferase), protein domain (e.g. protein A-ZZ
domain) or small peptide (e.g. His-tag), which selectively binds to the solid carrier so
that the fusion partner is recovered as well. The fusion protein can be suitable for the
particular purpose as such, or cleavage of the product from the added fusion partner may
be desired. There are well-known methods in the art for cleavage of fusion proteins from
their partners by proteases, e.g. by factor X, thrombin or papain, or by genetically
introducing a protease cleavage site (e.g. Kex2 site) or autoprocessing domains (e.g.
Intein, New England Biolabs), or by chemical cleavage.

ATPS offer advantages, mainly with respect to technology, compared with solid state
based separation systems, e.g. affinity column-based techniques. The scale-up of
enzyme extraction is relatively simple, utilising commercially available equipment and
machinery common in the chemical industry. In addition, ATPS can be used in a continuous
process and can be relatively cost-effective. It can be used as a single step for
clarification, concentration and purification. ATPS can be used as a first capture step,
but for bulk products often no further purification is needed.



To aid selective separation in two-phase systems, recent publications have described the
fusion of small peptide tags of 12 amino acids to the protein to be purified. The most
successful of these soluble peptides contain tryptophans. So far they have mainly
been applied to very small molecules like the staphylococcal protein A derivative ZZT0
(Berggren et al., 1999; Hassinen et al., 1994; Kohler et al., 1991).



Use of ATPS has so far been limited to certain targets. Given the advantages of ATPS
in protein separation, purification and localisation, highly selective and powerful
methods should be developed. This is especially important for large scale processes,
where ATPS in general is very inexpensive as a first capture step or as the only step for
purification, clarification and concentration. The system should be universal, so that the
technique would be strong enough to mediate separation of, in principle, any component
to the desired phase irrespective of its size or biochemical properties.

Description of the invention 

In this invention we describe selective separation and partitioning of molecules and
particles by fusing them with targeting proteins having the capability to carry the
molecule or particle of interest to the desired phase in ATPS, and to keep it in this
phase if wanted. This invention is directed to making ATPS usable for every
biotechnological product. By addition of the targeting protein to selected products,
either by genetic tagging of proteins, by chemical binding, gluing or by use of any other
technique, the product molecule can be made more suitable for separation in ATPS. Using
ATPS, the product or a certain component is therefore driven to one phase while the other
components or by-products are directed to the other phase(s).

We also describe that efficient separation in ATPS can be obtained using targeting
proteins which are or can be larger than the described small soluble synthetic peptide
tags of 12 amino acids or less. These targeting molecules can aid in separating not only
small molecules but even large proteins and particles. Unlike the small peptide tags, it
is not necessary that they contain tryptophan residues, although they may do so. They can
be hydrophobic or moderately hydrophobic and/or amphipathic in nature, either in
monomeric form or when forming aggregates. Such proteins can be found in nature or
they can be designed, or obtained through, for instance, methods known in the art for
mutant generation, gene shuffling or directed evolution. Suitable targeting molecules can
be screened, for instance, by fusing the product of interest to a library of natural or
mutant sequences and screening the ability of the fusion molecules to separate in ATPS.
Furthermore, any molecule capable of separating in ATPS is a suitable targeting
molecule.



Several techniques exploiting purified protein for isolation of the corresponding gene 
may be used to find genes encoding suitable targeting molecules for ATPS. Suitable 
proteins or polypeptides may be purified on the basis of their properties. They can be 
obtained by applying the cells, cell extracts or culture media to ATPS and recovering 
the proteins or peptides separated into the phase containing the heavier hydrophobic 
phase material. Suitable targeting molecules may also be recovered for example from 
the culture medium foam formed either during the cultivation of a microorganism or 
caused by bubbling gas through the medium. Proteins and peptides suitable as targeting 
molecules may further be recovered from aggregates caused by freezing of culture 
media. After the targeting molecules have been purified, the corresponding genes are
isolated using techniques known to a person skilled in the art. Such techniques include 
for example screening of expression libraries using antibodies raised against purified 
polypeptide or peptide, and PCR cloning and screening of genomic and/or cDNA 
libraries with oligonucleotides designed on the basis of N-terminal or internal protein 
sequences. 

Examples of molecules found in nature that are suited as targeting proteins in ATPS are
hydrophobin-like small proteins. Hydrophobins are secreted proteins with interesting
physico-chemical properties that have recently been discovered in filamentous fungi
(Wessels, 1994; Wosten and Wessels, 1997; Kershaw and Talbot, 1998). One
characteristic feature of these proteins is their moderate hydrophobicity. They are usually
small proteins, approximately 70 to 160 amino acids, containing eight cysteine residues
in a conserved pattern, and usually do not contain tryptophans. However, multimodular
proteins with one or several hydrophobin domains and e.g. proline-rich
or asparagine/glycine repeats, as well as hydrophobins containing fewer than eight
cysteine residues, have also been characterized (Lora et al., 1994; Lora et al., 1995;
Arntz and Tudzynski, 1997). Hydrophobins have been divided into two classes based on their
hydropathy profiles (Wessels, 1994).

Today most protein data exists for the hydrophobins Sc3p of Schizophyllum commune
(class I), and cerato-ulmin of Ophiostoma ulmi and cryparin of Cryphonectria parasitica
(class II), although more than 30 gene sequences for hydrophobins have been published
(Wosten and Wessels, 1997). HFB genes are often naturally highly expressed, but due
to special requirements in cultivation conditions and the biochemical properties of the
proteins, purification of HFBs in large amounts has been difficult. For instance, only
relatively moderate production levels of a few mg per liter of Sc3 hydrophobin are
obtained in static cultures (Han Wosten, personal communication). Published purification
procedures include e.g. multi-step extraction from fungal cell walls using organic
solvents, and bubbling or freezing of culture filtrates (Wessels, 1994). No reports of
successful production of hydrophobins are available; levels of cerato-ulmin were no
higher than those obtained with other naturally occurring fungal isolates (Temple et al.,
1997).

Upon shaking hydrophobin-containing solutions, the protein monomers form rodlet-like
aggregates. These structures are similar to the ones found on surfaces of aerial structures.
The self-assembly of purified Sc3 hydrophobin into a 10 nm thick amphipathic layer on
hydrophilic and hydrophobic surfaces has been demonstrated (Wosten et al., 1994a;
Wosten et al., 1994b). This film is very strongly attached to the surface and is not broken,
for instance, by hot detergent. The hydrophobic side of the layer on hydrophilic surfaces
shows properties similar to Teflon (Wessels, 1994). The Sc3 assemblages, as well as
those of cerato-ulmin and cryparin, also form on gas-liquid or liquid-liquid interfaces,
thus stabilizing air bubbles or oil droplets in water.

Surface activity of proteins is generally low, but hydrophobins belong to surface-active
molecules, their surfactant capacity being at least similar to that of traditional
biosurfactants such as glycolipids, lipopeptides/lipoproteins, phospholipids, neutral
lipids and fatty acids (Wosten and Wessels, 1997). In fact Sc3 hydrophobin is the most
potent biosurfactant known. It lowers the water surface tension to 24 mJ m⁻² at a
concentration of 50 µg/ml due to a conformational change during self-assembly of monomers
into an amphipathic film (Wosten and Wessels, 1997).

Hydrophobin-like molecules vary in their properties. For instance, rodlet-forming
capacity has not been assigned to all hydrophobins (such as some class II), or they
might have a weaker tendency to form stable aggregates (Russo et al., 1992; Carpenter
et al., 1992). Another group of fungal amphipathic proteins are repellents (Wosten et al.,
1996 (Ustilago); for review, see Kershaw and Talbot, 1998). Consequently, other types
of proteins suited as targeting proteins for ATPS may have only some of the features
assigned to hydrophobins. Other suitable proteins are hydrophobic ones such as e.g.
lipases, cholesterol oxidase, membrane proteins, small peptide drugs, aggregating cell
wall proteins, lipopeptides or any parts of these or combinations of these, and other
molecules like glycolipids, phospholipids, neutral lipids and fatty acids in combination
with proteins or peptides.

In this invention the targeting protein, such as a hydrophobin-like protein or parts of it,
is bound to the product molecule or the component to be separated. First, phase forming
materials and optionally also additional salts are added to an aqueous solution
containing the fusion molecule or component, and optionally also the contaminating
materials. The added agents are mixed to facilitate their solubilization. As soon as they
are solubilized, the two phases are formed either by gravity settling or centrifugation. In
the separation the targeting protein drives the product to, for instance, the
detergent-rich phase, which could be either the top or the bottom phase. The method is not
only useful for purification of products of interest but also for keeping the product or
the component of interest, such as a biocatalyst, in a particular phase, which enables
certain useful biotechnical reactions.

Several ATPS systems are suitable for performing this invention. These include PEG
containing systems, detergent based systems and novel thermoseparating polymers.
Detergent based systems can be nonionic, zwitterionic, anionic or cationic. The system
can be based on amphiphilic polymeric detergents or micelle forming polymers. Novel
polymers can be based on polyethylene-polypropylene copolymers such as Pluronic
block copolymers, Brij, polyoxyethylene derivatives of partial ethers of fatty acids made
by adding polyoxyethylene chains to the nonesterified body and polyoxyethylene
derivatives. In the well known PEG/salt, PEG/dextran and PEG/starch (or derivatives such
as Reppal, hydroxypropyl starch) systems, PEG and water form the top
phase and dextran/starch/salt and water form the bottom phase. The salts used are
phosphate, citrate, sulfate or others. In the present process the target partitions
mainly to the top phase, while most of the contaminants separate mainly to the
bottom phase. Some hydrophobic contaminants might partition to the top phase as well.
Using detergent based systems, only one phase forming detergent has to be added;
optionally, salts and other chemicals can be used in addition. The mentioned chemicals
are added, and the solution is mixed. After mixing, the separation can take place either
by centrifugation or gravity settling. In order to separate into two phases, the temperature
of the solution has to be above the cloud-point of the detergent. The solution has to be
heated if the cloud-point is not reached otherwise. If wanted, a second separation step
can follow a first extraction step, and the product rich phase can be further purified.
Also the remaining product in the product poor and by-product rich phase can be
re-extracted. Very good K values can be obtained, and the yields and concentration factors
are high.
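
The effect of re-extracting the product-poor phase can be illustrated with a small calculation. The sketch below is not part of the disclosure: the function name is hypothetical, and it assumes an idealised case in which every extraction step recovers the same fraction of whatever product is still left in the product-poor phase.

```python
def cumulative_yield(step_yield: float, n_steps: int) -> float:
    """Total fraction of product recovered after n_steps extractions.

    Assumption (idealised): each step recovers `step_yield` of the product
    remaining in the product-poor phase, so the fraction lost after n steps
    is (1 - step_yield) ** n_steps.
    """
    if not 0.0 <= step_yield <= 1.0:
        raise ValueError("step_yield must lie between 0 and 1")
    if n_steps < 0:
        raise ValueError("n_steps must not be negative")
    return 1.0 - (1.0 - step_yield) ** n_steps


# Hypothetical single-step yield of 80 %: compare one extraction with a
# first extraction followed by one re-extraction.
for steps in (1, 2):
    print(steps, round(cumulative_yield(0.8, steps), 4))
```

In practice the single-step yield may change between steps, for instance because the phase composition differs after the first extraction, so the figure obtained this way is best read as an upper-level estimate.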

The process of the present invention can advantageously be used in laboratory scale but 
is especially suitable for large scale separations. It can successfully be used in the 
separation of proteins and components from large fermentations. Using genetic 
modifications, the method can be used to purify any protein of interest including 
extracellular enzymes and proteins such as cellulases and hemicellulases from mixtures 
containing large amounts of protein such as several grams per liter. Furthermore, this 
separation can be obtained from various culture media including industrial media 
containing particulate materials such as cellulose and spent grain. The method can be
used to purify the product from culture media of strains modified not to produce
endogenous hydrophobins. The separation can be done directly from the fermentation
broth, which can additionally contain cells, even viscous filamentous fungi. High biomass
levels are acceptable for the process, as explained in example 9. An example is the
extracellular endoglucanase I from the fungus Trichoderma reesei, which can be tagged
for instance with the class II HFBI and can for example be separated with the nonionic
polyoxyethylene C12-C18EO5. In this example the detergent rich phase is the lighter
phase and contains most of the tagged endoglucanase, while most of the other cellulases,
proteases and other enzymes remain in the heavier phase. The mycelium separates to the
bottom phase, too. The separation can be achieved using separation temperatures higher
than 25°C. The temperature can be decreased if certain salts like NaCl or K2SO4 are
added.

The invention describes separation of molecules produced in various different organisms
such as bacteria, yeast and filamentous fungi. The invention is suitable for purification
of product molecules from extra- or intracellular locations, including cell wall bound
molecules. It provides examples of how the fusion molecule can be secreted by these
different organisms, and also provides an example of how the fusion can be produced
intracellularly.

The invention further describes how fusion molecules consisting of several domains can
be constructed and successfully expressed and produced. The invention describes fusions
of the targeting molecule to a small protein (CBD), to a moderately sized protein (EGI)
and to a huge, highly glycosylated protein (Flo1), and different domain variations of
these. These molecules can be ready as such for biotechnical use. Alternatively, the
product can be cleaved from the targeting protein by any method known in the art, such
as with proteases, e.g. thrombin, factor X or papain, or by chemical cleavage. Furthermore,
ATPS is a preferential means to separate the product from the targeting
protein after cleavage, or these can be separated with other methods known in the art.

A surprising feature is that the targeting protein can also be used to carry large particles
to the desired phase in ATPS. This can be achieved if the particles already contain
proteins suited for targeting, as spores/conidia do in the case of fungi. The targeting
protein can also be attached to the particles or compounds in vitro. If cells are separated,
the targeting protein can alternatively be expressed in the recombinant cells in such a
way that it is exposed at the cell surface, whereby it mediates the separation of the cells
in ATPS. A teaching of how this can be done is provided in example 22. Other types of
molecules which direct the targeting molecule to the cell surface can be found e.g. in
the literature, including bacterial outer membrane proteins and lipoproteins (Ståhl and
Uhlén, 1997), and the yeast proteins a-agglutinin and flocculin (Schreuder et al., 1996;
Klis et al. (1994) WO 94/01567; Frenken (1994) WO 94/18330).

A further advantage of the system is that the invention combined with ATPS provides
a means to separate the product or desired component not only from other unnecessary
or unwanted proteins but also from harmful proteins such as proteases, as described in
example 6. Thus, the invention is particularly suited for production and purification of
heterologous proteins, e.g. sensitive mammalian proteins usually produced in limited
amounts in heterologous hosts. Such proteins are for instance antibodies or fragments
thereof, interferon, interleukin, oxidative enzymes and any foreign protein which can
otherwise be produced in a host. It is possible that separation of the product from e.g.
culture medium can also be obtained on-line or semi-continuously, thus minimising the
effect of proteases or other harmful components present in the culture. When the product
is produced intracellularly, the invention also provides means to separate the heterologous
product, for instance the inclusion bodies it may form, from the cellular extracts.

This invention describes for the first time that fusion proteins containing
hydrophobin-like molecules can be made and produced in significant amounts despite the very
particular properties of hydrophobin-like molecules. Importantly, this invention also
describes how recombinant strains producing increased amounts of hydrophobin-like
proteins as such can be made. This provides means to produce the targeting protein for
uses in which the targeting protein is to be bound to the product or particle
in vitro, to enable further separation of such molecules or particles in ATPS.

Importantly, this invention also describes how hydrophobin-like molecules can be
purified in ATPS very efficiently. The molecules can be separated in the same way as
the above mentioned fusions, for instance by PEG systems or by detergent-based
systems. Separation can be done from the culture medium or from cells. This provides
a significant improvement in making pure preparations containing hydrophobin-like
molecules, since due to their properties their purification is very complicated and results
in losses with the previously reported techniques, as described above.

The invention is further illustrated by the following Examples, which describe
construction of the fusion molecules of the invention, and partitioning of the molecules
of interest using the process according to the invention.

EXAMPLES
Example 1

Construction of vectors for expression of EGI and EGIcore HFBI fusion proteins
under the cbh1 and gpd1 promoters of Trichoderma and gpdA promoter of
Aspergillus

For construction of an EGI-HFBI fusion protein, the hfb1 (SEQ ID 1) coding region (from
Ser-23 to the STOP codon) and a peptide linker (Val Pro Arg Gly Ser Ser Ser Gly Thr
Ala Pro Gly Gly) preceding it were amplified with PCR using pTNS9 as a template and
as a 5' primer TCG GGC ACT ACG TGC GAG TAT AGC AAC GAG TAC TAG
TGG CAA TGC CTT GTT CCG CGT GGC TCT ACT TCT GGA ACC GCA (SEQ
ID 2) and as a 3' primer TCG TAC GGA TCC TCA AGC ACC GAC GGC GGT (SEQ
ID 3). pTNS9 is described in detail in Example 19. The sequence in bold in the
5' primer encodes 16 C-terminal residues of EGI. The sequence in italics is a thrombin
cleavage site and the underlined CACTACGTG is a DraIII site. The underlined
GGATCC in the 3' primer is a BamHI site. The 280 bp PCR fragment was purified
from agarose gel and ligated to pGEM-T T/A vector (Promega), resulting in pMQ102.
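
As an illustrative cross-check of the 3' primer (SEQ ID 3) quoted above, the following sketch locates the BamHI site and translates the reverse complement of the bases that follow it, which should read as the end of the coding strand. The helper names are hypothetical, the codon table is the standard genetic code, and the reading frame is an assumption made here for illustration; the disclosure itself only states that the primer carries a BamHI site and that the amplified region runs to the STOP codon.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# 3' primer as printed in the text (SEQ ID 3), spaces removed.
PRIMER_3 = "TCG TAC GGA TCC TCA AGC ACC GAC GGC GGT".replace(" ", "")
BAMHI_SITE = "GGATCC"

site_start = PRIMER_3.find(BAMHI_SITE)
# Bases downstream of the site anneal to the template; their reverse
# complement is the end of the coding strand (assumed frame, see lead-in).
coding_end = reverse_complement(PRIMER_3[site_start + len(BAMHI_SITE):])

print("BamHI site at index:", site_start)
print("coding-strand end:", coding_end)
print("translated:", translate(coding_end))
```

A check of this kind is a convenient way to catch transcription errors in a printed primer before ordering it, since a misplaced base usually shows up as a missing site or a missing stop codon.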

For construction of an EGIcore-HFBI fusion protein, the hfb1 coding region (as above)
was amplified with PCR using pTNS9 as a template and as a 5' primer ACT ACA
CGG AGG AGC TCG ACG ACT TCG AGC AGC CCG AGC TGC ACG GAG
AGC AAC GGC AAC GGC (SEQ ID 4) and as a 3' primer SEQ ID 3. The sequence
in bold in the 5' primer encodes amino acids 410-425 in EGI and the underlined
GAGCTC is a SacI site. The 260 bp PCR fragment was purified from agarose gel and
ligated to pPCRII T/A vector (Invitrogen), resulting in pMQ111.

In the next step, Trichoderma expression vectors for production of EGI-HFBI and
EGIcore-HFBI fusion proteins under the control of cbh1 promoter and terminator
sequences were constructed. The expression vector used as a backbone in the constructs
is pPLE3 (Nakari et al. (1994) WO 94/04673), which contains a pUC18 backbone and
carries the cbh1 promoter (SEQ ID 5) inserted at the EcoRI site. The cbh1 promoter is
operably linked to the full length egl1 cDNA (SEQ ID 6) coding sequence and to the
cbh1 transcriptional terminator (SEQ ID 7). The plasmid pMQ102 was digested with
DraIII and BamHI, and the released 280 bp fragment containing hfb1 and linker
sequences was purified from agarose gel and ligated to pPLE3 digested with DraIII and
BamHI. The plasmid pMQ111 was digested with SacI and BamHI, and the 260 bp
fragment containing the hfb1 sequence was ligated to pPLE3 digested with SacI and
BamHI. The resulting plasmids pMQ103 (Figure 1) and pMQ113 (Figure 2) carry the
coding sequences for full-length EGI linked to HFBI via a peptide linker and for
EGIcore linked to HFBI via its own linker region, respectively, under the control of
cbh1 promoter and terminator sequences.

Trichoderma expression vectors for production of EGI-HFBI and EGIcore-HFBI fusion
proteins under the control of gpd1 promoter and terminator sequences of Trichoderma
and gpdA promoter and trpC terminator sequences of Aspergillus were constructed as
follows. A SacII site was inserted between the XbaI and PacI sites of pMV4 using as
an adapter annealed primers TAA CCG CGG T (SEQ ID 8) and CTA GAC CGC GGT
TAA T (SEQ ID 9). The resulting plasmid is pMVQ. pMV4 contains a pNEB193 (New
England Biolabs) backbone, and carries a 1.2 kb Trichoderma gpd1 promoter (SEQ ID
10) and a 1.1 kb gpd1 terminator (SEQ ID 11) inserted at SalI-XbaI and BamHI-AscI
sites, respectively. The expression cassettes for EGI-HFBI and EGIcore-HFBI were
released from pMQ103 and pMQ113 with SacII and BamHI, purified from agarose gel
and ligated to pMVQ cut with SacII and BamHI. The resulting plasmids pMQ104
(Figure 3) and pMQ114 (Figure 4) carry the EGI-HFBI and EGIcore-HFBI cassettes,
respectively, under the control of Trichoderma gpd1 transcriptional control sequences.
Expression plasmids pMQ105 (Figure 5) and pMQ115 (Figure 6) containing EGI-HFBI
and EGIcore-HFBI cassettes, respectively, operably linked to the gpdA promoter and
trpC terminator of Aspergillus were constructed. EGI-HFBI and EGIcore-HFBI
cassettes were released from plasmids pMQ104 and pMQ114 with XbaI and BamHI,
blunted with T4 DNA polymerase and ligated to NcoI digested and T4 DNA
polymerase treated pAN52-1 (SEQ ID 12). pAN52-1 contains a pUC18 backbone, and
carries 2.3 kb gpdA promoter and 0.7 kb trpC terminator sequences of A. nidulans.

Example 2

Construction of vectors for over-production of HFBI on cellulase-inducing and
-repressing media

For over-expression of HFBI under the cbh1 promoter, the protein coding region of hfb1
was amplified with PCR using pEA10 (Nakari-Setälä et al., 1996) as a template. pEA10
carries a 5.8 kb genomic SalI fragment containing hfb1 coding and flanking sequences.
GTC AAC CGC GGA CTG CGC ATC ATG AAG TTC TTC GCC ATC (SEQ ID
13) was used as the 5' primer in the PCR and SEQ ID 3 as the 3' primer. The sequence
in bold in the 5' primer is 21 bp of cbh1 promoter adjacent to the translational start site
of the corresponding gene, and the underlined CCGCGG is a KspI site. The obtained
fragment of 430 bp was digested with KspI and BamHI and ligated to pMQ103 digested
with KspI and BamHI. The resulting plasmid pMQ121 (Figure 7) carries the coding
sequence of hfb1 operably linked to cbh1 transcriptional control sequences. The pEA10
plasmid is used for over-production of HFBI in cellulase-repressing conditions.
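
The primer description above can be checked mechanically. The following is a minimal Python sketch, given for illustration only and not part of the cloning procedure. It locates the KspI recognition sequence and the first ATG in the 5' primer (SEQ ID 13) as printed in this text; the variable names are hypothetical.

```python
# 5' primer (SEQ ID 13) as printed in the text, with the spaces removed.
PRIMER_5 = "GTCAACCGCGGACTGCGCATCATGAAGTTCTTCGCCATC"
KSPI_SITE = "CCGCGG"   # KspI recognition sequence named in the text
START_CODON = "ATG"    # translational start codon

ksp_index = PRIMER_5.find(KSPI_SITE)    # 0-based position of the KspI site
atg_index = PRIMER_5.find(START_CODON)  # 0-based position of the first ATG
upstream_length = atg_index             # number of bases preceding the ATG

print(f"KspI site starts at index {ksp_index}")
print(f"{upstream_length} bases precede the start codon")
```

With the primer as printed, 21 bases precede the first ATG, which agrees with the 21 bp of promoter sequence mentioned in the text.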

Example 3 

Transformation of Trichoderma and purification of the EGI-HFBI and EGIcore-HFBI
producing and HFBI over-producing clones

Trichoderma reesei strains QM9414 (VTT-D-74075) and Rut-C30 (VTT-D-86271)
were co-transformed essentially as described (Penttilä et al., 1987) using 3-13 µg of the
plasmids pMQ103, pMQ113, pMQ104, pMQ114, pMQ105, pMQ115, pMQ121 and
pEA10 and, as the selection plasmids, 1-3 µg of pToC202, p3SR2 or pARO21. pToC202
(pUC19 backbone) and p3SR2 (pBR322 backbone) plasmids carry 2.7 kb XbaI and 5
kb EcoRI-SalI genomic fragments of A. nidulans, respectively, containing the amdS
gene (Hynes et al., 1983; Tilburn et al., 1983). pARO21 is essentially the same as
pRLMex30 (Mach et al., 1994) and carries the E. coli hph gene operably linked to 730
bp of pki1 promoter and 1 kb of cbh2 terminator sequences of T. reesei. The Amd+ and
Hyg+ transformants obtained were streaked three times onto plates containing acetamide
and hygromycin, respectively (Penttilä et al., 1987). Thereafter spore suspensions were
made from transformants grown on Potato Dextrose agar (Difco).

The production of the fusion proteins EGI-HFBI and EGIcore-HFBI and of HFBI was
tested by slot blotting or Western analysis with EGI and HFBI specific antibodies from
shake flask or microtiter plate cultivations carried out in minimal medium supplemented
with either glucose, lactose or a mixture of Solka flock cellulose and/or spent grain
and/or whey. The spore suspensions of the fusion protein producing clones were purified
to single spore cultures on selection plates (containing either acetamide or hygromycin).
To determine the best producers, production of the fusion proteins was analyzed again
from these purified clones as described above.

T. reesei strains selected for further fermenter cultivations are VTT-D-98692 (pEA10),
VTT-D-98492 (pMQ121), VTT-D-98693 (pMQ103), VTT-D-98691 (pMQ113),
VTT-D-98681 (pMQ105) and VTT-D-98682 (pMQ115). These strains have QM9414
as the host strain. VTT-D-99702 (pMQ113) has Rut-C30 as the host strain.

Example 4

Cultivation of the EGI-HFBI and EGIcore-HFBI protein producing and HFBI 
over-producing Trichoderma strains 

EGI-HFBI and EGIcore-HFBI fusions were produced under the cbh1 promoter in a
15-litre fermenter using T. reesei strains VTT-D-98693 (pMQ103) and VTT-D-98691
(pMQ113), respectively. Strains were grown for 5 days on minimal medium (Penttilä et al.,
1987) containing 4% Solka flock cellulose (James River Corporation, Berlin, NH) and
2% spent grain (Primalco, Koskenkorva, Finland). EGIcore-HFBI was also produced
in a fermenter (15 l) using the Rut-C30 strain VTT-D-99702 (pMQ113) with 4%
lactose medium. To produce the EGI-HFBI and EGIcore-HFBI fusions
under the Aspergillus gpdA promoter, T. reesei strains VTT-D-98681 (pMQ105) and
VTT-D-98682 (pMQ115) were cultivated in a 15-litre fermenter. Strains were grown for 3 to 5
days on minimal medium supplemented with 2% glucose, 0.2% Peptone and 0.1%
Yeast Extract, with a glucose feed to maintain the glucose concentration in the range
of 1 to 3% throughout the cultivation. The HFBI over-producing strain VTT-D-98692
(pEA10) was grown similarly in 15 l on glucose medium, and the strain VTT-D-98492
(pMQ121) over-producing HFBI under the cbh1 promoter was cultivated for 7 days in a
15-litre fermenter on medium containing 4% Solka flock and 2% spent grain. The control
cultivations with the host strains of the transformants, QM9414 (VTT-D-74075) and
Rut-C30 (VTT-D-86271), were carried out on media containing i) Solka flock
cellulose and either spent grain or whey, ii) lactose and iii) glucose, similarly as
described above.

When appropriate, some T. reesei transformant strains and their host strains were also
cultivated at 28°C in shake flasks for 5 to 6 days in a 50 to 150 ml volume of
Trichoderma minimal medium (Penttilä et al., 1987) supplemented with either 3% Solka
flock cellulose and 1% spent grain or 3-4% glucose with glucose feeding.

Example 5 

Standard separation assays and analysis 

If not otherwise stated, the standard ATPS and subsequent analyses and calculations were
carried out as explained in this example.

In general, whole fermentation broth, supernatant (biomass separated by centrifugation
or filtration) or purified proteins in buffer were separated in 10 ml graduated tubes. First
detergent was added into the tubes and the tubes were then filled to 10 ml with the
protein-containing liquid. The amount of detergent in the tube was calculated in weight
percent of detergent. After thorough mixing in an overhead shaker the separation took
place either by gravity settling in a water bath at constant temperature or by
centrifugation at constant temperature. The separation was usually performed at 30°C;
the standard amount of detergent used was 2-5% (w/v). After separation the volume
ratio was noted and samples were taken from the lighter and heavier phase for analysis.

Two-phase separations were analysed qualitatively by using SDS-PAGE gels followed
by visualization of the fusion proteins with Coomassie brilliant blue R-250 (Sigma) or
Western blotting. Polyclonal anti-HFBI antibody was used in Western analysis for
detection of EGIcore-HFBI, EGI-HFBI and dCBD-HFBI proteins together with alkaline
phosphatase conjugated anti-rabbit IgG (Bio-Rad). Alkaline phosphatase activity was
detected colorimetrically with BCIP (5-bromo-4-chloro-3-indolyl-phosphate) used
in conjunction with NBT (nitro blue tetrazolium) (Promega).

The presence of contaminating endogenous EGI, CBHI and EGIII in the top phase was
tested with appropriate antibodies. Acidic protease activity in the top and bottom phase
was also tested using the SAP method (Food Chemicals Codex, p. 496-497, 1981), which is
based on the 30 min enzymatic hydrolysis of a hemoglobin substrate. All reactions were
performed at pH 4.7 and 40 °C. Unhydrolyzed substrate was precipitated with 14% TCA
and removed by filtration. The released tyrosine and tryptophan were determined
spectrophotometrically. Total protein concentrations were determined by the
Non-Interfering Protein Assay (Geno Technology, Inc).

EGI activity was detected using 4-methylumbelliferyl-β-D-cellobioside (MUC) (Sigma
M 6018) as substrate (Van Tilbeurgh H. & Claeyssens M., 1985; Van Tilbeurgh et al.,
1982). EGI hydrolyses the β-glycosidic bond and fluorogenic 4-methylumbelliferone
is released, which can be measured using a fluorometer equipped with a 360 nm excitation
filter and a 455 nm emission filter. CBHI also hydrolyses the substrate, and it was
inhibited by addition of cellobiose (C-7252, Sigma). EGI-containing liquid was added
in an appropriate dilution to a buffer containing 50 mM sodium acetate (pH 5),
0.6 mM MUC and 4.6 mM cellobiose. The mixture was heated to 50°C. The reaction
was stopped after ten minutes using 2% Na2CO3, pH 10. Purified CBHI was detected
using the same assay as for EGI without the addition of the inhibitor cellobiose.

The partition coefficient K was defined as the ratio of the measured concentrations or
activities in the top and bottom phase, respectively.
The Yield Y was defined as follows:

$$Y_T = \frac{1}{1 + \dfrac{V_B}{V_T} \cdot \dfrac{1}{K}}$$

where Y_T is the Yield of the top phase, and V_T and V_B are the volumes of the top and
bottom phase, respectively. The Yield of the bottom phase can be described accordingly.
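
The definitions of K and Y can be written out as a short calculation. The following Python sketch is illustrative only: the function names and the example numbers are assumptions and are not taken from any experiment in this document. It expresses the partition coefficient as the top-to-bottom concentration ratio and the top-phase Yield as a function of K and the two phase volumes.

```python
def partition_coefficient(c_top: float, c_bottom: float) -> float:
    """K: ratio of concentration (or activity) in the top phase to the bottom phase."""
    return c_top / c_bottom


def top_phase_yield(k: float, v_top: float, v_bottom: float) -> float:
    """Y_T = 1 / (1 + (V_B / V_T) * (1 / K)), returned as a fraction between 0 and 1."""
    return 1.0 / (1.0 + (v_bottom / v_top) / k)


def bottom_phase_yield(k: float, v_top: float, v_bottom: float) -> float:
    """Y_B = 1 / (1 + (V_T / V_B) * K), the complement of the top-phase Yield."""
    return 1.0 / (1.0 + (v_top / v_bottom) * k)


# Hypothetical example: activity 8.0 units/ml in 2 ml of top phase,
# 2.0 units/ml in 8 ml of bottom phase.
k_example = partition_coefficient(8.0, 2.0)
y_top_example = top_phase_yield(k_example, v_top=2.0, v_bottom=8.0)
y_bottom_example = bottom_phase_yield(k_example, v_top=2.0, v_bottom=8.0)
print(k_example, y_top_example, y_bottom_example)
```

In this hypothetical case each phase holds 16 units, so both Yields are one half and they add up to one, which is the mass balance the formula relies on.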

The mass balances, e.g. recovery of all added protein, were always checked for
completeness to ensure no artificially high Yield (e.g. due to possible inactivation of the
protein in the bottom phase). The values were usually calculated based on total enzyme
activity (EGI wt plus the EGI-fusion) and thus the values are underestimated for the
separation of the fusion as demonstrated in Example 16.

Example 6 

Small scale ATPS separation studies and gel analysis

EGI-HFBI and EGIcore-HFBI fusions produced under the cbh1 promoter in a 15-litre
fermenter on Solka flock cellulose and spent grain medium as described in Example 4
using T. reesei strains VTT-D-98693 (pMQ103) and VTT-D-98691 (pMQ113),
respectively, were separated in small scale ATPS as described above.

The phases from the two-phase separations were analysed qualitatively by using
SDS-PAGE gels followed by visualization of the fusion proteins with Coomassie brilliant
blue or Western blotting. Coomassie stained SDS-PAGE (10%) is shown in Figure 8. In the
lane containing the non-extracted culture filtrate three distinct closely migrating bands
can be seen (the sample was diluted 1/10 with H2O). The topmost band is CBHI, the
band in the middle is the EGIcore-HFBI fusion and the lower one endogenous EGI. In the
samples separated in ATPS, only two bands (CBHI and EGI) are seen in the sample
from the bottom phase and one band representing EGIcore-HFBI in the sample obtained
from the top phase.

Western blotting with HFBI antibody showed thick bands for the top phase, whereas for
the bottom phase there was only a faint band, demonstrating that the fusion is separating
strongly into the detergent top phase. Figure 9 shows the separation of the EGIcore-HFBI
fusion produced on cellulose media into the top phase. The presence of contaminating
endogenous EGI and EGIII in the top phase was tested with appropriate antibodies but no
signal was detected.

A small amount of endogenous CBHI was found in the upper phase when CBHI antibody
was used in Western blotting. EGI, EGIII and proteases were not found in the top phase.
Further purification from the contaminating CBHI was observed when the top phase was
re-extracted with 2% detergent. Figure 10 shows that the upper phase no longer
contains CBHI and that pure fusion protein is recovered.

EGIcore-HFBI was also produced in a fermenter (15 l) using the Rut-C30 strain
VTT-D-99702 (pMQ113) with 4% lactose medium. The separation in ATPS carried out in
the standard manner gave essentially the same result as the separation from cellulose
containing medium, thus demonstrating that the purification can be carried out from
several media relevant for large scale industrial use.

Acidic protease activity in the top phase was only 1/15 compared to the bottom phase
(table below), demonstrating that acidic proteases remain in the bottom phase.

                   A (275 nm)    HUT³/ml
Bottom phase¹      0.146         41.6
Top phase²         0.009         2.6

¹ 1/10 diluted bottom phase after separation of VTT-D-98691 culture filtrate with 2% detergent
² 1/100 diluted bottom phase after separation of VTT-D-98691 culture filtrate with 2% detergent
³ 1 HUT = enzyme concentration which in reaction conditions hydrolyses hemoglobin in 1 min so that the
absorbance at 275 nm of the formed hydrolysate equals that of 1.10 µg tyrosine/ml 0.006 N HCl solution.

These results show that the fusion protein can be purified extremely efficiently and the
resulting preparation is free of other proteins produced by the fungus, including proteases.

Example 7 

Recovery of the native EGI in ATPS after thrombin cleavage 

The EGI-HFBI protein produced by the strain VTT-D-98693 has a thrombin cleavage site
(LVPRGS) designed in the linker region between the EGI CBD and HFBI, which would
enable the recovery of the native EGI after thrombin cleavage. EGI-HFBI fusion protein
was purified from the culture filtrate (100 ml) of strain VTT-D-98693 grown on 4%
Solka flock cellulose and 2% spent grain as described in Example 4 using the 2-phase
separation system (5% detergent). After removal of the bottom phase the detergent
phase was extracted with isobutanol. The resulting water phase (~19 ml) was divided into
Eppendorf tubes and the liquid was evaporated in a Speed Vac. The remaining lyophilizate
was dissolved in 50 mM Tris-Cl (pH 8). To test the efficiency of thrombin cleavage, 9
units of thrombin (Sigma) were incubated for > 24 h with 1 mg of EGI-HFBI fusion protein
at 36 °C and pH 8.0. Coomassie stained SDS-PAGE (10%) was used for detection.

Only minor cleavage was observed in 48 h under these conditions (Figure 11), possibly
due to steric hindrance by O-glycosylation in the linker.

Example 8 

Separation of low concentrations of EGIcore-HFBI in ATPS

Detergent based aqueous two-phase systems were successfully applied using very low
concentrations (diluted) of EGIcore-HFBI fusion protein produced with the cbh1
promoter in T. reesei VTT-D-98691 (pMQ113) from a 15 liter cultivation carried out
on Solka flock cellulose with spent grain as described in Example 4.

The original protein concentration of the supernatant was 7.0 mg/ml. This supernatant
was diluted with de-ionised water by a factor of 100 and 1000, respectively. The fusion
protein could be separated using 2% (w/w) of the detergent C12-C18EO5 with
partitioning coefficients higher than 5. This is shown in the table below together with
the experiment with non-diluted supernatant. The partitioning coefficients were
calculated based on activity measurements for total EGI (wild type and fusion protein
together).

EGIcore-HFBI    non-diluted supernatant    dilution 1/100    dilution 1/1000
K               4.1                        5.3               5.6
Y [%]           38                         31                32

Example 9

Separation of EGIcore-HFBI from fungal biomass containing culture broths

T. reesei strain VTT-D-98691 (pMQ113-2) producing EGIcore-HFBI was cultivated (50
ml in 250 ml shake flasks) on Solka flock cellulose with spent grain as described in
Example 4. Directly after the cultivation, part of the whole broth was centrifuged at
3000 rpm for 30 min, the supernatant was decanted and the centrifuged mycelium was
added to the supernatant to obtain artificial whole broths containing different amounts
of biomass.

Using 5% of C12-C18EO5 in a 10 g experiment, broths consisting of up to 50% wet biomass
(weight of wet biomass divided by the sum of wet biomass and supernatant) could still
be separated without any difficulties. The Yield remained in between 61 and 64 % and
therefore it is not significantly different in comparison to the experiment carried out with
supernatant only (without mycelium) (see table below). The total recovery of the
fusion protein is even higher. This is most probably due to cell attached enzyme
extracted in the ATPS increasing the total amount of EGI. The partitioning coefficient
was calculated based on activity measurements for total EGI (wild type and fusion
protein together).

                                     K      Y [%]
supernatant                          5.5    62
25% of wet biomass in supernatant    7.3    66
40% of wet biomass in supernatant    6.4    61
50% of wet biomass in supernatant    7.6    64

Example 10

Separation of EGI-HFBI in ATPS 

EGI-HFBI from Trichoderma reesei strain VTT-D-98693 (pMQ103) from a 15 liter
cultivation carried out on Solka flock cellulose and spent grain as described in Example
4 was separated in a 10 g experiment using different amounts of C12-C18EO5. The
partitioning coefficients are shown below. The partitioning coefficient was calculated
based on activity measurements for total EGI (wild type and fusion protein together),
and as in previous examples the endogenous EGI is included in the partitioning
coefficients.

Detergent [% w/w]    2      3      5      7
K                    1.9    1.8    1.4    1.1

Example 11

Separation of EGIcore-HFBI in 50 ml 

EGIcore-HFBI from T. reesei strain VTT-D-98691 (pMQ113) cultivated in 15 liters
using Solka flock cellulose and spent grain as described in Example 4 was separated in
Falcon tubes in a 50 g experiment using 5% of C12-C18EO5. A partition coefficient
of 2.52 and a yield of 51 % could be obtained. The separation was performed at 30°C
at 3000 rpm for 30 minutes. The values are based on activity measurements for total EGI
activity (wild type and fusion protein together) including endogenous EGI.

Example 12 

Separation of EGIcore in ATPS using different detergents 

EGIcore-HFBI from T. reesei strain VTT-D-98691 (pMQ113) cultivated in 15 liters
using Solka flock cellulose with spent grain as described in Example 4 was separated
in a 10 g experiment using 2% of detergent in each experiment. The detergents
investigated in this example were C10EO5, C12EO5, C14EO6 (each Nikko
Chemicals, Japan), C12-C18EO5 ("Agrimul NRE 1205", Henkel, Germany), C12/14
5EO, C12/14 6EO (Clariant, Germany), C9/11 EO5.5 ("Berol 266", Akzo Nobel,
Germany) and Triton X-114 (Sigma, Germany). The partition coefficients and yields are
listed below. The values are based on activity measurements for total EGI activity (wild
type and fusion protein together) including endogenous EGI.

Detergent        K       Y(fusion) [%]
C10EO5           20      56
C12EO5           15      57
C12-C18EO5       14      66
C12/14 5EO       12      58
C12/14 6EO       14      62
C14EO6           11      54
C9/11 EO5.5      5       30
Triton X-114     0.16    53


Example 13 

Separation of EGIcore-HFBI in ATPS from glucose grown cultures

EGIcore-HFBI was separated from a cultivation of the Trichoderma reesei strain
VTT-D-98682 (pMQ115) cultivated with glucose as described in Example 4. The supernatant
was separated with 2% of the detergent C12-C18EO5. The fusion protein could be
partitioned with a K value of 2.4. In comparison, the K value for the native EGI is 0.3
when measured in a similar way for purified EGI.

Example 14 

Separation of EGIcore-HFBI using different concentrations of detergent 

EGIcore-HFBI from T. reesei VTT-D-98691 (pMQ113) cultivated in 15 litres using
Solka flock with spent grain as described in Example 4 was separated in detergent based
ATPS applying different amounts of the detergent C12-C18EO5 on the cell free
supernatant. The partitioning coefficients are shown in the table below. The corresponding
gel electrophoresis and Western antibody-blots are shown in Figure 8 and Figure
9, respectively.

The values are based on activity measurements of total EGI activity.

Amount of detergent C12-C18EO5    K      Yield (%)
1.0%                              6.1    9
2.0%                              4.1    38
3.5%                              3.6    50
5.0%                              2.9    55
7.5%                              1.7    53
10.0%                             1.1    58
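
The K and Yield values in the table above can be related to each other through the Yield definition of Example 5. The following Python sketch is an illustration under stated assumptions, not a calculation reported in this document: it assumes that the tabulated Yields follow exactly from Y_T = 1 / (1 + (V_B / V_T) / K) and rearranges that formula to obtain the top-to-bottom volume ratio that each K and Y pair would imply. The function name is hypothetical.

```python
def implied_volume_ratio(k: float, yield_top: float) -> float:
    """Return V_T / V_B implied by K and the top-phase Yield (given as a fraction).

    Rearranged from Y_T = 1 / (1 + (V_B / V_T) / K):
        V_T / V_B = Y_T / (K * (1 - Y_T))
    """
    return yield_top / (k * (1.0 - yield_top))


# (detergent %, K, Yield %) as listed in the table of this example.
ROWS = [
    (1.0, 6.1, 9),
    (2.0, 4.1, 38),
    (3.5, 3.6, 50),
    (5.0, 2.9, 55),
    (7.5, 1.7, 53),
    (10.0, 1.1, 58),
]

ratios = [implied_volume_ratio(k, y / 100.0) for _, k, y in ROWS]
for (detergent, k, y), ratio in zip(ROWS, ratios):
    print(f"{detergent:4.1f}% detergent: K = {k}, Y = {y}%, implied V_T/V_B = {ratio:.2f}")
```

Under this assumption the implied top-phase volume grows with the amount of detergent, which would explain why the Yield rises while K falls.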



Example 15

Re-extraction of the detergent phase

Detergent based ATPS was applied on EGIcore-HFBI fusion protein containing
supernatant produced by the strain VTT-D-98691 (pMQ113) in a shake flask
cultivation. The first extraction using C12-C18EO5 conducted under the standard
conditions shows a partitioning coefficient of 16 and a yield of 72 % (wild type EGI
measured together with fusion protein). The top phase was re-extracted in 10 mM
sodium acetate buffer (pH 5) with 2% of detergent. A partitioning coefficient of 52
and a yield of 89 % could be obtained. In the re-extraction experiment of the bottom
phase (2% of detergent), a small yield of 7.5% and a K of 0.8 of EGI activity were
achieved. The partitioning coefficients were calculated based on activity measurements
for total EGI (wild type and fusion protein together). Due to the wild type EGI present
in the sample, the yield is at least 72% and the partitioning coefficient at least 16 in the
first extraction. The SDS-PAGE results of both extractions are shown in Figure 10.
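
The yields of the first extraction and of the re-extraction of the top phase can be combined into a rough overall figure. The following Python sketch is an illustration only. It assumes that the 89 % yield of the re-extraction refers to the activity loaded into that step and that both yields are expressed on the same basis (total EGI activity); neither assumption is stated in this document.

```python
import math

# Top-phase yields of the two sequential steps, as fractions.
step_yields = [0.72, 0.89]  # first extraction, re-extraction of the top phase

# Under the stated assumptions the overall yield is the product of the step yields.
overall_yield = math.prod(step_yields)
print(f"Overall top-phase yield after two steps: {overall_yield:.1%}")
```

Because the first-step value includes wild type EGI that stays in the bottom phase, the overall figure would be a lower bound for the fusion protein itself.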



Separation step               K      Y [%]
2% detergent                  16     72
re-extraction top phase       52     89
re-extraction bottom phase    0.8    7.5

Example 16

Separation of pure cellulases in ATPS 

The effect of HFBI on partitioning and the final yield can further be demonstrated by
comparing the extraction result of the EGIcore-HFBI fusion with extraction results
obtained with purified wild type EGI and EGIcore. The fusion protein is partitioning more
than 100 times better to the detergent phase (see table below).

The improvement on the partitioning of the purified fusion protein from the first
extraction obtained in the re-extraction (see Example 15) can be explained by the
partitioning of the wild type EGI as demonstrated with purified wild type EGI in the
table below. The wild type EGI lowers the partitioning coefficient in the first extraction
(since EGI activity is measured from both top and bottom phase), but the absence of it
in the re-extraction increases the partitioning coefficient of the EGIcore-HFBI fusion.
The purity can in addition be demonstrated by analysing the partitioning of pure CBHI,
which is the major contaminating protein corresponding to about 50 % of all secreted
T. reesei proteins. Pure CBHI has a partitioning coefficient of 0.5 and a yield of 3.6 %
and is therefore separated from the fusion protein.



Separation step                     K      Y [%]
re-extraction of top phase          52     89
extraction of pure wild type EGI    0.3    2.2
extraction of pure EGI-core         0.3    2.3
extraction of pure CBHI             0.5    3.6


Using the definitions of K and Y and calculating mass balances, the ratio of the amount
of EGI fusion protein to EGI wild type can be calculated. The "true" partition
coefficients and Yields can be concluded from this. "True" means the values which
would be detected if it were possible to measure only the amount of EGI-fusion
without measuring the amount of EGI wild type at the same time.

The basis for the calculation is the re-extraction experiment. The re-extracted top
phase is believed to be pure. An example of the measured values and the calculated
"true" values based on this are shown in the table below for two cultivations of
VTT-D-98691 (pMQ113) grown as described in Example 4.



cultivation           cultivation      K "with     "true"    Y [%] "with    "true"
vessel                substrate        EGI wt"     K         EGI wt"        Y [%]
15 liter fermenter    whey permeate    4           6         16             54
250 ml shake flask    cellulose        16          54        66             90
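
The mass-balance reasoning described in this example can be sketched in code. The following Python sketch is one possible model and is not necessarily the calculation used for the table above. It assumes that the measured Yield is the activity-weighted mean of the Yields of fusion protein and wild type enzyme, that both are measured at the same phase volume ratio, and that all Yields are given as fractions. The function names and the numbers are hypothetical.

```python
def wild_type_fraction(y_observed: float, y_fusion: float, y_wild_type: float) -> float:
    """Fraction of total activity due to wild type enzyme (linear mixing model).

    Assumes y_observed = (1 - x) * y_fusion + x * y_wild_type and solves for x.
    """
    return (y_fusion - y_observed) / (y_fusion - y_wild_type)


def corrected_fusion_yield(y_observed: float, wt_fraction: float, y_wild_type: float) -> float:
    """Yield of the fusion protein alone, given the wild type share of total activity."""
    return (y_observed - wt_fraction * y_wild_type) / (1.0 - wt_fraction)


# Hypothetical illustration: observed Yield 70 %, pure fusion 90 %, pure wild type 10 %.
x_wt = wild_type_fraction(0.70, 0.90, 0.10)
y_true = corrected_fusion_yield(0.70, x_wt, 0.10)
print(f"wild type share of activity: {x_wt:.2f}, corrected fusion Yield: {y_true:.2f}")
```

The first function estimates the wild type share from an experiment in which the Yield of the pure fusion protein is known, as in the re-extraction; the second applies that share to correct a Yield measured on total activity.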



Example 17

HFBI and HFBII purification in ATPS

HFBI was produced by cultivating the T. reesei strain VTT-D-98692 (pEA10-103B)
using glucose as substrate as described in Example 4. HFBI could be separated using
2% of the detergent C12-C18EO5 with a partition coefficient higher than 20 under the
standard conditions described.

HFBII was produced by cultivating the T. reesei strain VTT-D-74075 (QM9414) on
whey and spent grain as described in Example 4. HFBII could be separated using 2% of the
detergent C12-C18EO5, exceeding a partition coefficient of 10 under the standard
conditions.

Both HFBI and HFBII hydrophobins are thus partitioning well to the upper phase in
ATPS.

Example 18

Detergent based ATPS with additional NaCl

EGIcore-HFBI from a cultivation of T. reesei was separated in a 10 g experiment
using 5% of C12-C18EO5. The partitioning coefficient of the supernatant was 3.5 with
a volume ratio of 0.2. Using 1.1 % (w/v) NaCl the partitioning coefficient could be
increased to 4.3 with a lower volume ratio of 0.14.
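
The effect of a higher K combined with a smaller top phase can be made concrete with the Yield definition of Example 5. The following Python sketch is an illustration only. It assumes that the reported volume ratio is top phase volume divided by bottom phase volume, and that the concentration in the feed equals the total amount divided by the sum of both phase volumes; neither point is stated in this document, and the function names are hypothetical.

```python
def top_phase_yield(k: float, volume_ratio: float) -> float:
    """Top-phase Yield as a fraction; volume_ratio is assumed to be V_T / V_B."""
    return 1.0 / (1.0 + 1.0 / (volume_ratio * k))


def concentration_factor(k: float, volume_ratio: float) -> float:
    """Top-phase concentration relative to the feed: Y_T * (V_T + V_B) / V_T."""
    return top_phase_yield(k, volume_ratio) * (1.0 + volume_ratio) / volume_ratio


# (K, volume ratio) as reported in this example.
conditions = {"without NaCl": (3.5, 0.2), "with 1.1 % NaCl": (4.3, 0.14)}

for label, (k, ratio) in conditions.items():
    y = top_phase_yield(k, ratio)
    cf = concentration_factor(k, ratio)
    print(f"{label}: Yield = {y:.2f}, concentration factor = {cf:.2f}")
```

Under these assumptions the salt addition would give a more concentrated top phase, while the fraction of protein recovered in it would not necessarily increase.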

Example 19 

Construction of an E. coli strain expressing a fusion protein HFBI-dCBD,
containing hydrophobin I and double cellulose binding domains (CBD)

A 280 bp DNA fragment containing a modified cbh2 linker region followed by the
coding region of hfb1 from Ser-23 to the STOP codon was amplified by PCR using the
plasmid pARO1 (Nakari-Setälä et al., 1996) as a template. The 5' primer was 5' TCT
AGC AAG CTT GGC TCT ACT TCT GGA ACC GCA CCA GGC GGC AGC
AAC GGC AAC GGC AAT GTT TGC (SEQ ID 14) and the 3' primer was 5' TCG
TAG AAG CTT TCA AGC ACC GAG GGC GGT (SEQ ID 15). The sequences in bold
in the 5' and 3' primers encode the modified CBHII linker (Gly Ser Ser Ser Gly Thr
Ala Pro Gly Gly) and a translational STOP, respectively, and the underlined AAGCTT
in both primers is a HindIII site. The PCR fragment was purified from agarose gel,
digested with HindIII and ligated to HindIII-digested and SAP-treated (Shrimp Alkaline
Phosphatase, USB) pSP73, resulting in plasmid pTNS9.

For subsequent cloning of the modified CBHII linker-HFBI fragment to an E. coli
expression vector, pTNS9 was digested with HindIII and the proper fragment was
purified from agarose gel. This HindIII fragment was cloned to HindIII-digested and
SAP-treated (Shrimp Alkaline Phosphatase, USB) B599, resulting in pTNS13 (Figure
12). The E. coli expression vector B599 is essentially the same as the one described by
Linder et al. (1996) except that it is missing a STOP codon at the end of the protein
coding sequence. It carries the coding sequence for a fusion protein containing CBHII
CBD (41 N-terminal residues of CBHII) and CBHI CBD linked together via the CBHI
linker region (CBHI linker and CBD are the last 57 residues in CBHI). The expression
and secretion of the fusion protein in B599 is under the control of the tac promoter and
pelB signal sequence (Takkinen et al., 1991). The pTNS13 expression vector thus carries
the coding region for a fusion protein of double CBD and HFBI linked in frame via the
Gly-Ser-Ser-Ser-Gly-Thr-Ala-Pro-Gly-Gly peptide. This vector also contains the
amp gene for selection of E. coli transformants. The pTNS13 plasmid was transformed into
E. coli strain RV308 (su-, ΔlacX74, galISII::OP308, strA) and this strain was used for
production of the fusion protein.

Example 20 

Separation of HFBI-dCBD molecules expressed in E. coli in ATPS

dCBD-HFBI was produced in E. coli strain RV308 transformed with the pTNS13 plasmid
as described above. The inoculum of RV308/pTNS13 was grown to the exponential
growth phase in LB medium containing ampicillin (0.1 g/l) and 1% glucose.
Fermentation was carried out using the mineral salt medium described by Pack et al. (1993)
with glucose (feed) in a 10-litre fermenter. During cultivation the temperature was
maintained at 28 °C and the pH was controlled at 6.8 with NH4OH. Cell growth was
monitored by measuring OD600 and dry weight of biomass. The culture was induced with
50 µM (final concentration) IPTG (isopropyl-β-D-thiogalactopyranoside) at the
late-exponential growth phase (OD600 = 50-60) to promote fusion protein production.

Two-phase separation analysis of dCBD-HFBI protein was performed using culture
filtrate and 5% detergent in a total volume of 40 ml. Results from Western blotting
showed that 2-phase separation with 5% detergent in the standard way was highly
specific also for the dCBD-HFBI fusion. A strong signal was observed in the sample from
the detergent phase compared to the sample from the bottom phase, as shown in Figure
13.

Example 21

Construction of yeast strains expressing HFBI-Flo1 fusion protein on the cell
surface

For construction of a HFBI-FLO1 fusion protein expression cassette, the hfb1 (SEQ ID 1)
coding region (from Ser-23 to the STOP codon) was amplified with PCR using pARO1
(Nakari-Setälä et al., 1996) as a template and as a 5' primer TCT AGC TCT AGA
AGC AAC GGC AAC GGC AAT GTT (SEQ ID 16) and as a 3' primer TGC TAG
TCG ACC TGC TAG CAG CAC CGA CGG CGG TCT G (SEQ ID 17). The
underlined sequences in the 5' and 3' primers are XbaI and NheI sites, respectively.
The 0.225 kb PCR fragment was purified from agarose gel and ligated to pGEM-T
vector (Promega), resulting in pTNS10. The hfb1 fragment was released from pTNS10
with XbaI and NheI and ligated to pTNS15 cut with the same restriction enzymes.
Plasmid pTNS15 (Figure 14) is essentially the same as plasmid pBR-ADH1-FLO1L by
Watari et al. (1994) except that an NheI site in the pBR322 backbone has been replaced
by a BglII site and a unique XbaI site is introduced by linker cloning in the unique AocI
site preceding the putative signal sequence cleavage site. The resulting plasmid pTNS18
(Figure 15) contains the complete expression cassette for HFBI-FLO1 fusion protein in
which HFBI substitutes the putative lectin domain from Ser-26 to Ser-319 in the yeast
flocculin FLO1 (SEQ ID 18).

In the next step, a yeast expression vector for production of HFBI-FLO1 fusion protein
was constructed. The expression vector used as a backbone in the construct is pYES2
(Invitrogen) (SEQ ID 19), which is a high-copy episomal vector designed for inducible
expression of recombinant proteins in S. cerevisiae. It carries GAL1 promoter and CYC1
terminator sequences which regulate transcription, and the 2µ origin of replication and
URA3 gene for maintenance and selection in the host strain. The plasmid pTNS18 was
digested with HindIII and the released 3.95 kb fragment containing the expression
cassette for HFBI-FLO1 was purified from agarose gel and ligated to pYES2 digested
with HindIII. This ligation mixture was concentrated by standard ethanol precipitation.
The ligation mixture should contain, besides unligated fragments and incorrect ligation
products, also molecules where the vector and insert are correctly ligated with each
other to result in plasmid pTNS23 (Fig. 16), which carries the expression cassette for
HFBI-FLO1 operably linked to GAL1 promoter and CYC1 terminator sequences.

The above ligation mixture was transformed using the LiAc method of Gietz et al.
(1992) into a laboratory S. cerevisiae strain H452 (wild type W303-1A; Thomas and
Rothstein, 1989). Transformant colonies able to grow on SC-URA plates were picked
and streaked on selective plates. Nitrocellulose replicas were taken from the plates and
treated for colony hybridization according to Sherman et al. (1983). To find those yeast
colonies containing the pTNS23 plasmid, replicas were hybridized with digoxigenin
labelled hfb1 coding fragment, after which an immunological detection was performed,
all according to the manufacturer (Boehringer Mannheim). Plasmids were recovered
from several yeast colonies giving a positive hybridization signal by isolating total DNA
and using this in electroporation of E. coli. Restriction mapping and sequencing were
carried out to confirm that the pTNS23 plasmid in the yeast transformants was correct.
One of the transformants carrying plasmid pTNS23 was chosen for further studies and
was designated VTT-C-99315. The control strain for it is yeast strain H2155, which
carries the plasmid pYES2 in the H452 background.

Example 22

Separation of yeast cells expressing HFBI-Flo1 fusion protein in ATPS

The Saccharomyces cerevisiae strain VTT-C-99315 (vector pTNS23) and its control
strain H2155 (vector pYES2) were cultivated on synthetic complete medium lacking
uracil (SC-URA) (Sherman, 1991) with 2% galactose as the carbon source to give an
absorbance (A) of approximately 4. Approximately 6.3 x 10^ cells in their culture medium
were taken to ATPS using 7% (w/v) C12-18EO5 detergent (Agrimul NRE from Henkel) in
a total volume of 5 ml. ATPS was carried out using the standard protocol. After phase
separation by gravity settling, the top detergent phase was clearly turbid in the case of
the strain VTT-C-99315, in contrast to the control strain, whose detergent phase was
clear (Figure 17). Samples were taken from the top phases and dilution series from 10"^
to 10"^ were prepared in 0.9% NaCl and plated on YPD plates. After incubation at 30°C
the yeast colonies were counted, showing at least 70 times more colonies of the strain
VTT-C-99315 on YPD plates than of the control strain. This clearly shows that cells
expressing a hydrophobin on the cell surface separate to the detergent phase even in a
system overloaded with cells.
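The colony-count comparison can be sketched as a small calculation. The colony numbers, dilutions and plated volume below are hypothetical placeholders and are not data from this example; the sketch only assumes the usual back-calculation of viable cells from a dilution plate.

```python
def cfu_per_ml(colonies, dilution, plated_volume_ml):
    """Back-calculate colony-forming units per ml of the undiluted sample.

    colonies          -- colonies counted on the plate
    dilution          -- dilution of the plated sample, e.g. 1e-4
    plated_volume_ml  -- volume spread on the plate, in ml
    """
    if colonies < 0 or dilution <= 0 or plated_volume_ml <= 0:
        raise ValueError("counts must be >= 0; dilution and volume must be > 0")
    return colonies / (dilution * plated_volume_ml)


def fold_enrichment(sample_cfu_per_ml, control_cfu_per_ml):
    """Ratio of viable cells in the sample top phase to the control top phase."""
    return sample_cfu_per_ml / control_cfu_per_ml


# Hypothetical counts, for illustration only.
fusion_strain = cfu_per_ml(colonies=150, dilution=1e-4, plated_volume_ml=0.1)
control_strain = cfu_per_ml(colonies=15, dilution=1e-3, plated_volume_ml=0.1)
ratio = fold_enrichment(fusion_strain, control_strain)
print(f"fold enrichment in the detergent phase: {ratio:.0f}")
```

Plates from different dilutions can be compared this way because each count is first normalised to the undiluted top phase.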



Example 23 

Production of EGIcore-HFBI fusion proteins in T. reesei Δhfb2 strain for improved
partitioning of the fusion protein in ATPS

Trichoderma reesei strain QM9414 Δhfb2 (VTT-D-99726) was transformed essentially
as described (Penttilä et al., 1987) using 10 µg of the plasmid pMQ113 together with
3 µg of the selection plasmid pTOC202 containing the amdS gene (Hynes et al., 1983;
Tilburn et al., 1983) of Aspergillus nidulans encoding acetamidase. pMQ113
contains an expression cassette for production of EGIcore-HFBI fusion protein under
the control of cbh1 promoter and terminator sequences.

The Amd+ transformants obtained were streaked two times onto plates containing
acetamide (Penttilä et al., 1987). Thereafter spore suspensions are made from
transformants grown on Potato Dextrose agar (Difco). The production of the
EGIcore-HFBI fusion protein is tested by slot blotting or Western analysis with EGI and
HFBI specific antibodies from shake flask or microtiter plate cultivations carried out in
minimal medium supplemented with a mixture of Solka floc cellulose and/or spent
grain and/or whey. The spore suspensions of the clones producing fusion protein are
purified to single spore cultures on selection plates (containing acetamide). To determine
the best producers, production of the fusion protein is analyzed again from these purified
clones as described above.

For partitioning experiments of the EGIcore-HFBI fusion protein in ATPS using the
polyoxyethylene detergent C12-18EO5 (Agrimul NRE 1205, Henkel), the best production
strain obtained in this study and, as control strains, VTT-D-98691 (QM9414 strain
producing EGIcore-HFBI) and VTT-D-74075 (QM9414) are cultivated at 28°C in
shake flasks for 5 to 6 days in 50 to 250 ml volume of Trichoderma minimal medium
(Penttilä et al., 1987) supplemented with 3% Solka floc cellulose and 1% spent grain.
Standard partitioning experiments as described in Example 5 are carried out with culture
supernatants (biomass separated by centrifugation or filtration). After separation the
volume ratio of the lighter and heavier phase is noted and the concentration factor for
the fusion protein is calculated from it. Samples are also taken from the lighter and
heavier phase and analysed with SDS-PAGE, Western blotting and activity measurements
as described in Example 5. Partition coefficients (K) and yields (Y) are calculated
as described in Example 5.
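Example 5 is not reproduced in this section, so the sketch below assumes the definitions commonly used for two-phase systems: K is the concentration ratio between the lighter and heavier phase, and Y is the fraction of the protein recovered in the lighter phase. It also assumes the fusion protein is enriched in the lighter phase. The input values are hypothetical, and the formulas actually used in Example 5 may differ.

```python
def partition_coefficient(c_light, c_heavy):
    """K = concentration in the lighter phase / concentration in the heavier phase."""
    return c_light / c_heavy


def light_phase_yield(k, v_light, v_heavy):
    """Fraction of the protein found in the lighter phase.

    Mass balance: Y = K*V_light / (K*V_light + V_heavy)
                    = 1 / (1 + (V_heavy / V_light) / K)
    """
    return 1.0 / (1.0 + (v_heavy / v_light) / k)


def concentration_factor(yield_light, v_light, v_heavy):
    """Concentration in the lighter phase relative to the starting mixture,
    assuming the phase volumes add up to the starting volume."""
    return yield_light * (v_light + v_heavy) / v_light


# Hypothetical measurements (arbitrary concentration units, volumes in ml).
K = partition_coefficient(c_light=8.0, c_heavy=2.0)
Y = light_phase_yield(K, v_light=1.0, v_heavy=4.0)
CF = concentration_factor(Y, v_light=1.0, v_heavy=4.0)
print(K, Y, CF)
```

With these inputs half of the protein ends up in one fifth of the volume, which is why the concentration factor follows directly from the phase volume ratio once the yield is known.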

Example 24 

Production of HFBI-single chain antibody fusion protein in T. reesei Δhfb2 strain
for purification in ATPS

A T. reesei strain is constructed which produces a fusion protein consisting of T. reesei
HFBI protein at the N-terminus and, at the C-terminus, a single chain antibody
recognizing a small molecular weight derivative of diarylalkyltriazole (ENA5SCFV).
Production of the fusion protein is under the cbh1 regulatory sequences. The fusion
protein is to be purified further using an aqueous two-phase system.

For construction of HFBI-ENA5SCFV fusion protein, pENA5SCFV was digested with
NcoI and XbaI. The fragment containing the ena5scfv gene and the histidine tail (6 x
His) was blunt-end cloned to pTNS29 resulting in pTH1. pENA5SCFV vector carries the
coding region for ENA5 single chain antibody consisting of the variable domains of the
heavy and light chains connected via a glycine serine linker and a 6 x histidine tag at
the C-terminal end. Transcription and secretion of the single chain antibody are under
control of the tac promoter and pelB signal sequence, respectively (Takkinen et al.,
1991). pTNS29 vector carries the hfb1 coding region of T. reesei followed by a linker
sequence (ProGlyAlaSerThrSerThrGlyMetGlyProGlyGly) under the control of cbh1
promoter and terminator sequences.

For construction of HFBI-ENA5SCFV fusion protein with a thrombin cleavage site in
the linker peptide, the ena5scfv coding region (from Ala-23 to the STOP codon) and a
peptide linker containing the thrombin cleavage site (Gly Thr Leu Val Pro Arg Gly Pro
Ala Glu Val Asn Leu Val) preceding it were amplified with PCR using pENA5SCFV
as a template and as a 5' primer GAA TTC GGTACC CTC GTC CCT CGC GGT CCC
GCC GAA GTG AAC CTG GTG and as a 3' primer TGA ATT CCA TAT GCT AAC
CCC GTT TCA TCT CCA G. The sequence in bold in the 5' primer encodes the first
6 N-terminal residues of ENA5SCFV. The sequence in italics is a thrombin cleavage
site and underlined GGT ACC is an Asp718 site. The sequence in bold in the 3' primer
encodes the 6 C-terminal residues of ENA5SCFV and the underlined CA TATG is an
NdeI site. The 790 bp PCR fragment was purified from agarose gel and ligated to
pTNS29 resulting in pTH2.
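As a consistency check on the primer design, the sketch below translates the 5' primer from its Asp718 site onward and locates the restriction sites in both primers. It uses the standard genetic code. The codon GCT in the 3' primer is an assumed reading of a character damaged in the scanned text, chosen because it completes the stated NdeI site CATATG.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string (spaces ignored) in frame 1, dropping a partial last codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


FIVE_PRIME = "GAA TTC GGTACC CTC GTC CCT CGC GGT CCC GCC GAA GTG AAC CTG GTG".replace(" ", "")
# 'GCT' below is an assumed reading of the damaged codon (see lead-in).
THREE_PRIME = "TGA ATT CCA TAT GCT AAC CCC GTT TCA TCT CCA G".replace(" ", "")

ASP718_SITE = "GGTACC"
NDEI_SITE = "CATATG"

asp718_start = FIVE_PRIME.find(ASP718_SITE)
ndei_start = THREE_PRIME.find(NDEI_SITE)
linker_peptide = translate(FIVE_PRIME[asp718_start:])

print(asp718_start, ndei_start, linker_peptide)
```

In one-letter code the translation reads GTLVPRGPAEVNLV, which corresponds to the linker peptide Gly Thr Leu Val Pro Arg Gly Pro Ala Glu Val Asn Leu Val given in the text.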

Trichoderma reesei strain VTT-D-99726 (QM9414 Δhfb2) is co-transformed essentially
as described (Penttilä et al., 1987) using 10 µg of the plasmids pTH1 and pTH2 and, as
selection plasmid, 2 µg of pTOC202. Amd+ transformants obtained are streaked two times
onto plates containing acetamide. Thereafter spore suspensions are made from
transformants grown on Potato Dextrose agar (Difco).

The production of the two HFBI-ENA5SCFV fusion proteins is tested by Western
analysis with HFBI specific antibody and with the antibody against the his-tail from
shake flask cultivations carried out in minimal medium supplemented with 3% lactose
or Solka floc cellulose and spent grain.

Partitioning experiments of the HFBI-ENA5 fusion proteins in ATPS using the
polyoxyethylene detergent C12-18EO5 (Agrimul NRE 1205, Henkel) with the supernatants
of the best production strains obtained in this study and of the control strain
VTT-D-99726 (QM9414 Δhfb2) are carried out and analysed as described in Example 5.
Sequences



SEQ ID 1: Coding sequence of hfb1, 428 bp, introns underlined.



ATGAAGTT CTTCGCCATC GCCGCTCTCT TTGCCGCCGC 

TGCCGTTGCC CAGCCTCTCG AGGACCGCAG CAACGGCAAC GGCAATGTTT GCCCTCCCGG 

CCTCTTCAGC AACCCCCAGT GCTGTGCCAC CCAAGTCCTT GGCCTCATCG GCCTTGACTG 

CAAAGTCC GT AAGTTGAGCC ATAACATAAG AATCCTCTTG ACGGAAATAT GCCTTCTCAC 

TCCTTTACCC CTGAACAG CC TCCCAGAACG TTTACGACGG CACCGACTTC CGCAACGTCT 

GCGCCAAAAC CGGCGCCCAG CCTCTCTGCT GCGTGGCCCC CGTT GTAAGT TGATGCCCCR 

GCTCAAGCTC CAGTCTTTGG CAAACCCATT CTGACACCCA GACTGCAG GC CGGCCAGGCT 
CTTCTGTGCC AGACCGCCGT CGGTGCTTGA 



SEQ ID 2 



TCG GGC ACT ACG TGC CAG TAT AGC AAC GAC TAC TAC TCG CAA TGC CTT GTT
CCG CGT GGC TCT AGT TCT GGA ACC GCA



SEQ ID 3 



TCG TAG GGATCC TCA AGC ACC GAC GGC GGT 



SEQ ID 4 



ACT ACA CGG AGG AGC TCG ACG ACT TCG AGC AGC CCG AGC TGC ACG CAG AGC
AAC GGC AAC GGC



SEQ ID 5: T. reesei cbh1 promoter, 2211 bp



GAATTCTCAC GGTGAATGTA GGCCTTTTGT AGGGTAGGAA TTGTCACTCA AGCACCCCCA 
ACCTCCATTA CGCCTCCCCC ATAGAGTTCC CAATCAGTGA GTCATGGCAC TGTTCTCAAA 
TAGATTGGGG AGAAGTTGAC TTCCGCCCAG AGCTGAAGGT CGCACAACCG CATGATATAG 
GGTCGGCAAC GGCAAAAAAG CACGTGGCTC ACCGAAAAGC AAGATGTTTG CGATCTAACA 
TCCAGGAACC TGGATACATC CATCATCACG CACGACCACT TTGATCTGCT GGTAAACTCG 
TATTCGCCCT AAACCGAAGT GCGTGGTAAA TCTACACGTG GGCCCCTTTC GGTATACTGC 
GTGTGTCTTC TCTAGGTGCA TTCTTTCCTT CCTCTAGTGT TGAATTGTTT GTGTTGGGAG 
TCCGAGCTGT AACTACCTCT GAATCTCTGG AGAATGGTGG ACTAACGACT ACCGTGCACC 
TGCATCATGT ATATAATAGT GATCCTGAGA AGGGGGGTTT GGAGCAATGT GGGACTTTGA 
TGGTCATCAA ACAAAGAACG AAGACGCCTC TTTTGCAAAG TTTTGTTTCG GCTACGGTGA 
AGAACTGGAT ACTTGTTGTG TCTTCTGTGT ATTTTTGTGG CAACAAGAGG CCAGAGACAA 
TCTATTCAAA CACCAAGCTT GCTCTTTTGA GCTACAAGAA CCTGTGGGGT ATATATCTAG 
AGTTGTGAAG TCGGTAATCC CGCTGTATAG TAATACGAGT CGCATCTAAA TACTCCGAAG 
CTGCTGCGAA CCCGGAGAAT CGAGATGTGC TGGAAAGCTT CTAGCGAGCG GCTAAATTAG 
CATGAAAGGC TATGAGAAAT TCTGGAGACG GCTTGTTGAA TCATGGCGTT CCATTCTTCG 
ACAAGCAAAG CGTTCCGTCG CAGTAGCAGG CACTCATTCC CGAAAAAACT CGGAGATTCC
TAAGTAGCGA TGGAACCGGA ATAATATAAT AGGCAATACA TTGAGTTGCC TCGACGGTTG
CAATGCAGGG GTACTGAGCT TGGACATAAC TGTTCCGTAC CCCACCTCTT CTCAACCTTT
GGCGTTTCCC TGATTCAGCG TACCCGTACA AGTCGTAATC ACTATTAACC CAGACTGACC
GGACGTGTTT TGCCCTTCAT TTGGAGAAAT AATGTCATTG CGATGTGTAA TTTGCCTGCT
TGACCGACTG GGGCTGTTCG AAGCCCGAAT GTAGGATTGT TATCCGAACT CTGCTCGTAG
AGGCATGTTG TGAATCTGTG TCGGGCAGGA CACGCCTCGA AGGTTCACGG CAAGGGAAAC
CACCGATAGC AGTGTCTAGT AGCAACCTGT AAAGCCGCAA TGCAGCATCA CTGGAAAATA
CAAACCAATG GCTAAAAGT^ CATAAGTTAA TGCCTAAAGA AGTCATATAC CAGCGGCTAA
TAATTGTACA ATCAAGTGGC TAAACGTACC GTAATTTGCC AACGCGTTGT GGGGTTGCAG
AAGCAACGGC AAAGCCCACT TCCCACGTTT GTTTCTTCAC TCAGTCCAAT CTCAGCTGGT
GATCCCCCAA TTGGGTCGCT TGTTTGTTCC GGTGAAGTGA AAGAAGACAG AGGTAAGAAT
GTCTGACTCG GAGCGTTTTG CATACAACCA AGGGCAGTGA TGGAAGACAG TGAAATGTTG
ACATTCAAGG AGTATTTAGC CAGGGATGCT TGAGTGTATC GTGTAAGGAG GTTTGTCTGC
CGATACGACG AATACTGTAT AGTCACTTCT GATGAAGTGG TCCATATTGA AATCTAAGTC
GGCACTGAAC AGGCAAAAGA TTGAGTTGAA ACTGCCTAAG ATCTCGGGCC CTCGGGCTTC
GGCTTTGGGT GTACATGTTT GTGCTCCGGG CAAATGCAAA GTGTGGTAGG ATCGACACAC
TGCTGCCTTT ACCAAGCAGC TGAGGGTATG TGATAGGCAA ATGTTCAGGG GCCACTGCAT
GGTTTCGAAT AGAAAGAGAA GCTTAGCCAA GAACAATAGC CGATAAAGAT AGCCTCATTA
AACGAAATGA GCTAGTAGGC AAAGTCAGCG AATGTGTATA TATAAAGGTT CGAGGTCCGT
GCCTCCCTCA TGCTCTCCCC ATCTACTCAT CAACTCAGAT CCTCCAGGAG ACTTGTACAC
CATCTTTTGA GGCACAGAAA CCCAATAGTC AACCGCGGAC TGCGCATCAT G



SEQ ID 6: T. reesei egl1 cDNA, 1588 bp

CCCCCCTATC TTAGTCCTTC TTGTTGTCCC AAAATGGCGC CCTCAGTTAC ACTGCCGTTG
ACCACGGCCA TCCTGGCCAT TGCCCGGCTC GTCGCCGCCC AGCAACCGGG TACCAGCACC
CCCGAGGTCC ATCCCAAGTT GACAACCTAC AAGTGTACAA AGTCCGGGGG GTGCGTGGCC
CAGGACACCT CGGTGGTCCT TGACTGGAAC TACCGCTGGA TGCACGACGC AAACTACAAC
TCGTGCACCG TCAACGGCGG CGTCAACACC ACGCTCTGCC CTGACGAGGC GACCTGTGGC
AAGAACTGCT TCATCGAGGG CGTCGACTAC GCCGCCTCGG GCGTCACGAC CTCGGGCAGC
AGCCTCACCA TGAACCAGTA CATGCCCAGC AGCTCTGGCG GCTAGAGCAG CGTCTCTCCT
CGGCTGTATC TCCTGGACTC TGACGGTGAG TACGTGATGC TGAAGCTCAA CGGCCAGGAG
CTGAGCTTCG ACGTCGACCT CTCTGCTCTG CCGTGTGGAG AGAACGGCTC GCTCTACCTG
TCTCAGATGG ACGAGAACGG GGGCGCCAAC CAGTATAACA CGGCCGGTGC CAACTACGGG
AGCGGCTACT GCGATGCTCA GTGCCCCGTC CAGACATGGA GGAACGGCAC CCTCAACACT
AGCCACCAGG GCTTCTGCTG CAACGAGATG GATATCCTGG AGGGCAACTC GAGGGCGAAT
GCCTTGACCC CTCACTCTTG CACGGCCACG GCCTGCGACT CTGCCGGTTG CGGCTTCAAC
CCCTATGGCA GCGGCTACAA AAGCTACTAC GGCCCCGGAG ATACCGTTGA CACCTCCAAG
ACCTTCACCA TCATCACCCA GTTCAACACG GACAACGGCT CGCCCTCGGG CAACCTTGTG
AGCATCACCC GCAAGTACCA GCAAAACGGC GTCGACATCC CCAGCGCCCA GCCCGGCGGC
GACACCATCT CGTCCTGCCC GTCCGCCTCA GCCTACGGCG GCCTCGCCAC CATGGGCAAG
GCCCTGAGCA GCGGCATGGT GCTCGTGTTC AGCATTTGGA ACGACAACAG CCAGTACATG
AACTGGCTCG ACAGCGGCAA CGCCGGCCCC TGCAGCAGCA CCGAGGGCAA CCCATCCAAC
ATCCTGGCCA ACAACCCCAA CACGCACGTC GTCTTCTCCA ACATCCGCTG GGGAGACATT
GGGTCTACTA CGAACTCGAC TGCGCCCCCG CCCCCGCCTG CGTCCAGCAC GACGTTTTCG
ACTACACGGA GGAGCTCGAC GACTTCGAGC AGCCCGAGCT GCACGCAGAC TCACTGGGGG
CAGTGCGGTG GCATTGGGTA CAGCGGGTGC AAGACGTGCA CGTCGGGCAC TACGTGCCAG
TATAGCAACG ACTACTACTC GCAATGCCTT TAGAGCGTTG ACTTGCCTCT GGTCTGTCCA
GACGGGGGCA CGATAGAATG CGGGCACGCA GGGAGCTCGT AGACATTGGG CTTAATATAT
AAGACATGCT ATGTTGTATC TACATTAGCA AATGACAAAC AAATGAAAAA GAACTTATCA
AGCAAAAAAA AAAAAAAAAA AAAAAAAA



SEQ ID 7: T. reesei cbh1 terminator, 745 bp

GGACCTACCC AGTCTCACTA CGGCCAGTGC GGCGGTATTG GCTACAGCGG CCCCACGGTC
TGCGCCAGCG GCACAACTTG CCAGGTCCTG AACCCTTACT ACTCTCAGTG CCTGTAAAGC
TCCGTGCGAA AGCCTGACGC ACCGGTAGAT TCTTGGTGAG CCCGTATCAT GACGGCGGCG
GGAGCTACAT GGCCCCGGGT GATTTATTTT TTTTGTATCT ACTTCTGACC CTTTTCAAAT
ATACGGTCAA CTCATCTTTC ACTGGAGATG CGGCCTGCTT GGTATTGCGA TGTTGTCAGC
TTGGCAAATT GTGGCTTTCG AAAACACAAA ACGATTCCTT AGTAGCCATG CATTTTAAGA
TAACGGAATA GAAGAAAGAG GAAATTAAAA AAAAAAAAAA AACAAACATC CCGTTCATAA
CCCGTAGAAT CGCCGCTCTT CGTGTATCCC AGTACCACGT CAAAGGTATT CATGATCGTT
CAATGTTGAT ATTGTTCCGC CAGTATGGCT CCACCCCCAT CTCCGCGAAT CTCCTCTTCT
CGAACGCGGT AGTGGCTGCT GCCAATTGGT AATGACCATA GGGAGACAAA CAGCATAATA
GCAACAGTGG AAATTAGTGG CGCAATAATT GAGAACACAG TGAGACCATA GCTGGCGGCC
TGGAAAGCAC TGTTGGAGAC CAACTTGTCC GTTGCGAGGC CAACTTGCAT TGCTGTCAAG
ACGATGACAA CGTAGCCGAG GACCC

SEQ ID 8 



TAA CCG CGG T 



SEQ ID 9 



CTA GAC CGC GGT TAA T 



SEQ ID 10: T. reesei gpd1 promoter
GTCGAC 

ACGATATACA GGCGCGGCTG ATGATAATGA TGATCGAGCA TGACTTGATG 
CTGTATGTGA CAATATTGAC TGCGAGGAAC CATCAGGTGT GTATGGATGG 
AATCATTCTG TAACCACCAA GGTGCATGCA TCATAAGGAT TCTCCTCAGC 
TCACCAACAA CGAACGATGG CCATGTTAGT GAAGGCACCG TGATGGCAAG 
ATAGAACCAC TATTGCATCT GCGCTTCCCA CGCACAGTAC GTCAAGTAAC 
GTCAAAGCCG CCCTCCCGTA ACCTCGCC CG TTGTTGCTCC CCCCGATTGC 
CTCAATCACA TAGTACCTAC CTATGCATTA TGGGCGGCCT CAACCCACCC 
CCCCAGATTG AGAGCTACCT TACATCAATA TGGCCAGCAC CTCTTCGGCG 
ATACATACTC GCCACCCCAG CCGGCGCGAT TGTGTGTACT AGGTAGGCTC 
GTACTATACC AGCAGGAGAG GTGCTGCTTG GCAATCGTGC TCAGCTGTTA 
GGTTGTACTT GTATGGTACT TGTAAGGTGG TCATGCAGTT GCTAAGGTAC 
CTAGGGAGGG ATTCAACGAG CCCTGCTTCC AATGTCCATC TGGATAGGAT 
GGCGGCTGGC GGGGCCGAAG CTGGGAACTC GCCAACAGTC ATATGTAATA 
GCTCAAGTTG ATGATACCGT TTTGCCAGAT TAGATGCGAG AAGCAGCATG 
AATGTCGCTC ATCCGATGCC GCATCACCGT TGTGTCAGAA ACGACCAAGC 
TAAGCAACTA AGGTACCTTA CCGTCCACTA TCTCAGGTAA CCAGGTACTA 
CCAGCTACCC TACCTGCCGT GCCTACCTGC TTTAGTGTTA ATCTTTCCAC 
CTCCCTCCTC AATCTTCTTT TCCCTCCTCT CCTCTTTTTT TTTTCTTCCT 
CCTCTTCTTC TCCATAACCA TTCCTAACAA CATCGACATT CTCTCCTAAT 
CACCAGCCTC GCAAATCCTC AGTTTGTATG TACGTACGTA CTACAATCAT 
CACCACGATC GTCCGCCCGA CGATGCGGCT TCTGTTCGCC TGCCCCTCCT 
CTCACTCGTG CCCTTGACGA GCTAGCCCCG CCAGGACTCT CCTGCGTCAC 
CAATTTTTTT CCCTATTTAC CCCTCCTCCC TCTCTCCCTC TCGTTTCTTC 
CTAACAAACA ACCACCACCA AAATCTCTTT GGAAGCTCAC GACTCACGCA 
GCTCAATTC GCAGATACAA ATCTAGA 

SEQ ID 11: T. reesei gpd1 terminator



GGATCC CGAGCATT GTCTATGAAT GCAAACAAAA ATAGTAAATA AATAGTAATT 
CTGGCCATGA CGAATAGAGC CAATCTGCTC CACTTGACTA TCTTGTGACT 
GTATCGTATG TCGAACCCTT GACTGCCCAT TCAAACAATT GTAAAGGAAT 
ATAGCTACAA GTTATGTCTC ACGTTTGCGT GCGAGCCCGT TTGTACGTTA 
TTTTGAGAAA GCGTTGCCAT CACATGCTCA CAGTCACTTG GCTTACGATC 
ATGTTTGCGA TCTTCGGTAA GAATACACAG AGTAACGATT ATCTCCATCG 
CTTCTATGAT TAGGTACTCA GACAACACAT GGGAAACAAG ATAACCATCG
CATGCAAGGT CGATTCCAAT CATGATCTGG ACTGGGGTAT TCCATCTAAG 
CCATAGTACC CTCGAGAGAA GGAATGGTAG GACCTCTCAG GCGTCCACCA 
TCTGTGCTGC AAATCCAAGA AACCCCCCAA AAGCACCTAC CTATCTACCT 
AGAGTAACTG CACGAGAAAA GAAAAGGAGC AGAAGAAGAA TGATCTCAAG 
AGGCCGTGAA CGCAGAAACA CACTCCTCCC AACTTTTCAA GTTTTGAACA 
AAAAAAGAAA GATGAGGACT AGAAGATGGA GTATTTCCTT CTTAGAGAGC 
TCTCGGTGAG GTGACCTGTC AGGGTTTACC GCAAACCGTC GGTGGTTCTA 
TCCAATTAAT CAAGTCCCGC GCCTCGCCTC TTCTCTCCTG TCCTTTCATA 
GAATCCCGTC TCCTTGTTGC TTGATCGAAG CGGGGTTATC GACGCCACCA 
AAGATCTTGT CTTGGTGACT TATCAATCCT TTGGTGATCA AACAGCCCCC 
GAGTGATCAG ATCCGTAAAA GAAGAAGAAG AGTACGATTT AACCAGACCG 
AGGAACAATA AAGCGAGTAA ATAACATCAA AATAAGAGTC TCGTTGAAAA 
TTACTTGTTC CTCAATCAAT CCCAACCCCC CTAAAAGCCC TTCCCCCCAT 
GGTATATCCC GGCAGTAGGA GAGAGATATT TCCACTACCG CTCACCACCA 
AGTGAGGCT TGCCGAGAGA AGAGGATGAA TCAGAAGTGA CAACAACGGG 
TTGAGCACAT GGGATATC GGCGCGCC 

SEQ ID 12: Sequence of plasmid pAN52-1, 5733 bp

1-2129 Aspergillus nidulans gpdA promoter
2130-2304 A. nidulans gpdA
2305-3071 A. nidulans trpC terminator
3072-5726 pUC18 from SalI to EcoRI

Sequence 5733 BP; 1435 A; 1454 C; 1378 G; 1463 T; 3 other;
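The header figures for pAN52-1 can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the header. The GC fraction and the count of unannotated positions are derived values that do not appear in the source, and the short feature labels are paraphrased from the header lines.

```python
# Base composition and feature coordinates as stated in the pAN52-1 header.
composition = {"A": 1435, "C": 1454, "G": 1378, "T": 1463, "other": 3}
stated_length = 5733
features = [
    (1, 2129, "A. nidulans gpdA promoter"),
    (2130, 2304, "A. nidulans gpdA"),
    (2305, 3071, "A. nidulans trpC terminator"),
    (3072, 5726, "pUC18 backbone"),
]

# The per-base counts should add up to the stated sequence length.
total_bases = sum(composition.values())

# GC fraction over unambiguous bases only (derived value).
unambiguous = total_bases - composition["other"]
gc_fraction = (composition["G"] + composition["C"]) / unambiguous

# Coordinates are 1-based and inclusive, so each feature spans end - start + 1 bases.
annotated = sum(end - start + 1 for start, end, _ in features)
unannotated = stated_length - annotated

print(total_bases, round(gc_fraction, 4), annotated, unannotated)
```

The counts sum to the stated 5733 bp, and the four listed features cover positions 1 to 5726, leaving the last seven positions without an annotation in the header.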




CAATTCCCTT 


GTATCTCTAC 


ACACAGGCTC 


AAATCAATAA 


GAAGAACGGT 


TCGTCTTTTT 


60 


CGTTTATATC 


TTGCATCGTC 


CCAAAGCTAT 


TGGCGGGATA 


TTCTGTTTGC 


AGTTGGCTGA 


120 


CTTGAAGTAA 


TCTCTGCAGA 


TCTTTCGACA 


CTGAAATACG 


TCGAGCCTGC 


TCCGCTTGGA 


180 


AGCGGCGAGG 


AGCCTCGTCC 


TGTCACAACT 


ACCAACATGG 


AGTACGATAA 


GGGCCAGTTC 


240 


CGCCAGCTCA 


TTAAGAGCCA 


GTTCATGGGC 


GTTGGCATGA 


TGGCCGTCAT 


GCATCTGTAC 


300 


TTCAAGTACA 


CCAACGCTCT 


TCTGATCCAG 


TCGATCATCC 


GCTGAAGGCG 


CTTTCGAATC 


360 


TGGTTAAGAT 


CCACGTCTTC 


GGGAAGCCAG 


CGACTGGTGA 


CCTCCAGCGT 


CCCTTTAAGG 


420 


CTGCCAACAG 


CTTTCTCAGC 


CAGGGCCAGC 


CCAAGACCGA 


CAAGGCCTCC 


CTCCAGAACG 


480 


CCGAGAAGAA 


CTGGAGGGGT 


GGTGTCAAGG 


AGGAGTAAGC 


TCCTTATTGA 


AGTCGGAGGA 


540 


CGGAGCGGTG 


TCAAGAGGAT 


ATTCTTCGAC 


TCTGTATTAT 


AGATAAGATG 


ATGAGGAATT 


600 


GGAGGTAGCA 


TAGCTTCATT 


TGGATTTGCT 


TTCCAGGCTG 


AGACTCTAGC 


TTGGAGCATA 


660 


GAGGGTCCTT 


TGGCTTTCAA 


TATTCTCAAG 


TATCTCGAGT 


TTGAACTTAT 


TCCCTGTGAA 


720 


CCTTTTATTC 


ACCAATGAGC 


ATTGGAATGA 


ACATGAATCT 


GAGGACTGCA 


ATCGCCATGA 


780 


GGTTTTCGAA 


ATACATCCGG 


ATGTCGAAGG 


CTTGGGGCAC 


CTGCGTTGGT 


TGAATTTAGA 


840 


ACGTGGCACT 


ATTGATCATC 


CGATAGCTCT 


GCAAAGGGCG 


TTGCACAATG 


CAAGTCAAAC 


900 
GTTGCTAGCA


GTTCCAGGTG 


GAATGTTATG 


ATGAGCATTG 


TATTAAATCA 


GGAGATATAG 


960 


CATGRTCTCT 


AGTTAGCTCA


CCACAAAAGT 


CAGACGGCGT 


AACCAAAAGT 


CACACAACAC 


1020 


AAGCTGTAAG


GATTTCGGCA 


CGGCTACGGA 


AGACGGAGAA 


GCCACCTTCA 


GTGGACTCGA 


1080 


GTACCATTTA 


ATTCTATTTG


TGTTTGATCG 


AGACCTAATA 


CAGCCCCTAC 


AACGACCATC 


1140 


AAAGTCGTAT 


AGCTACCAGT 


GAGGAAGTGG 


ACTCAAATCG 


ACTTCAGCAA 


CATCTCCTGG 


1200 


ATAAACTTTA 


AGCCTAAACT 


ATACAGAATA 


AGATAGGTGG 


AGAGCTTATA 


CCGAGCTCCC 


1260 


AAATCTGTCC 


AGATCATGGT 


TGACCGGTGC 


CTGGATCTTC 


CTATAGAATC 


ATCCTTATTC 


1320 


GTTGACCTAG 


CTGATTCTGG 


AGTGACCCAG 


AGGGTCATGA 


CTTGAGCCTA 


AAATCCGCCG 


1380 


CCTCCACCAT


TTGTAGAAAA 


ATGTGACGAA 


CTCGTGAGCT 


CTGTACAGTG 


ACCGGTGACT 


1440 


CTTTCTGGCA 


TGCGGAGAGA 


CGGACGGACG 


CAGAGAGAAG


GGCTGAGTAA 


TAAGCCACTG 


1500 


GCCAGACAGC


TCTGGCGGCT 


CTGAGGTGCA 


GTGGATGATT 


ATTAATCCGG 


GACCGGCCGC 


1560 


CCCTCCGCCC 


CGAAGTGGAA 


AGGCTGGTGT 


GCCCCTCGTT 


GACCAAGAAT 


CTATTGCATC 


1620 


ATCGGAGAAT


ATGGAGCTTC 


ATCGAATCAC 


CGGCAGTAAG 


CGAAGGAGAA 


TGTGAAGCCA 


1680 


GGGGTGTATA 


GCCGTCGGCG 


AAATAGCATG 


CCATTAACCT 


AGGTACAGAA 


GTCCAATTGC 


1740 


TTCCGATCTG 


GTAAAAGATT 


CACGAGATAG 


TACCTTCTCC 


GAAGTAGGTA 


GAGCGAGTAC 


1800 


CCGGCGCGTA 


AGCTCCCTAA 


TTGGCCCATC 


CGGCATCTGT 


AGGGCGTCCA 


AATATCGTGC 


1860 


CTCTCCTGCT 


TTGCCCGGTG 


TATGAAACCG 


GAAAGGCCGC 


TCAGGAGCTG 


GCCAGCGGCG 


1920 


CAGACCGGGA 


AGACAAGCTG 


GCAGTCGACC 


CATCCGGTGC 


TCTGCACTCG 


ACCTGCTGAG 


1980 


GTCCCTCAGT 


CCCTGGTAGG 


CAGCTTTGCC 


CCGTCTGTCC 


GCCCGGTGTG 


TCGGCGGGGT 


2040 


TGACAAGGTC 


GTTGCGTCAG 


TCCAACATTT 


GTTGCCATAT 


TTTCCTGCTC 


TCCCCACCAG 


2100 


CTGCTCTTTT 


CTTTTCTCTT 


TCTTTTCCCA 


TCTTCAGTAT 


ATTCATCTTC 


CCATCCAAGA


2160 


ACCTTTATTT 


CCCCTAAGTA 


AGTACTTTGC 


TACATCCATA 


CTCCATCCTT 


CCCATCCCTT 


2220 


ATTCCTTTGA 


ACCTTTCAGT 


TCGAGCTTTC 


CCACTTCATC 


GCAGCTTGAC 


TAACAGCTAC 


2280 


CCCGCTTGAG 


CAGACATCAC 


CATGGATCCA 


CTTAACGTTA 


CTGAAATCAT 


CAAACAGCTT 


2340 


GACGAATCTG 


GATATAAGAT 


CGTTGGTGTC 


GATGTCAGCT 


CCGGAGTTGA 


GACAAATGGT 


2400 


GTTCAGGATC 


TCGATAAGAT 


ACGTTCATTT 


GTCCAAGCAG 


CAAAGAGTGC 


CTTCTAGTGA 


2460 


TTTAATAGCT 


CCATGTCAAC 


AAGAATAAAA 


CGCGTTTTCG 


GGTTTACCTC 


TTCCAGATAC 


2520 


AGCTCATCTG 


CAATGCATTA 


ATGCATTGAC 


TGCAACCTAG 


TAACGCCTTN 


CAGGCTCCGG 


2580 


CGAAGAGAAG 


AATAGCTTAG 


CAGAGCTATT 


TTCATTTTCG 


GGAGACGAGA 


TCAAGCAGAT 


2640 


CAACGGTCGT 


CAAGAGACCT 


ACGAGACTGA 


GGAATCCGCT 


CTTGGCTCCA 


CGCGACTATA 


2700 


TATTTGTCTC 


TAATTGTACT 


TTGACATGCT 


CCTCTTCTTT 


ACTCTGATAG 


CTTGACTATG 


2760 


AAAATTCCGT 


CACCAGCNCC 


TGGGTTCGCA 


AAGATAATTG 


CATGTTTCTT 


CCTTGAACTC 


2820 


TCAAGCCTAC 


AGGACACACA 


TTCATCGTAG 


GTATAAACCT 


CGAAATCANT 


TCCTACTAAG 


2880 


ATGGTATACA 


ATAGTAACCA 


TGCATGGTTG 


CCTAGTGAAT 


GCTCCGTAAC 


ACCCAATACG 


2940 


CCGGCCGAAA 


CTTTTTTACA 


ACTCTCCTAT 


GAGTCGTTTA 


CCCAGAATGC 


ACAGGTACAC 


3000 


TTGTTTAGAG 


GTAATCCTTC 


TTTCTAGAAG 


TCCTCGTGTA 


CTGTGTAAGC 


GCCCACTCCA 


3060 


CATCTCCACT 


CGACCTGCAG 


GCATGCAAGC 


TTGGCACTGG 


CCGTCGTTTT 


ACAACGTCGT 


3120 


GACTGGGAAA 


ACCCTGGCGT 


TACCCAACTT 


AATCGCCTTG 


CAGCACATCC 


CCCTTTCGCC 


3180 


AGCTGGCGTA 


ATAGCGAAGA 


GGCCCGCACC 


GATCGCCCTT 


CCCAACAGTT 


GCGCAGCCTG 


3240 


AATGGCGAAT 


GGCGCCTGAT 


GCGGTATTTT 


CTCCTTACGC 


ATCTGTGCGG 


TATTTCACAC 


3300 


CGCATATGGT 


GCACTCTCAG 


TACAATCTGC 


TCTGATGCCG 


CATAGTTAAG 


CCAGCCCCGA 


3360 


CACCCGCCAA 


CACCCGCTGA 


CGCGCCCTGA 


CGGGCTTGTC 


TGCTCCCGGC 


ATCCGCTTAC 


3420 


AGACAAGCTG 


TGACCGTCTC 


CGGGAGCTGC 


ATGTGTCAGA 


GGTTTTCACC 


GTCATCACCG 


3480 


AAACGCGCGA 


GACGAAAGGG 


CCTCGTGATA 


CGCCTATTTT 


TATAGGTTAA 


TGTCATGATA 


3540 


ATAATGGTTT 


CTTAGACGTC 


AGGTGGCACT 


TTTCGGGGAA 


ATGTGCGCGG 


AACCCCTATT 


3600 


TGTTTATTTT 


TCTAAATACA 


TTCAAATATG 


TATCCGCTCA 


TGAGACAATA 


ACCCTGATAA 


3660 
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nor* irn.r' t.r"r 




nii O El T T T r* l^fl 
*VrWrt 4. J. -L V-rVvj 




3720 


ATTCCCTTTT 




1 IGCCl ICC L 


1 1 1 1 IGL* IC 




GC I GG 1 G AAA 


3780 


GTAAAAGATG 


CTG AAG A 1 CA 


GTTGGGTGCA 


CGAG IGGG TT 


AC A J. CG AAC 1 


GGATCTCAAC 


3840 


AGCGGTAAGA 


TCCTTGAGAG 


TTTTCGCCCC 


^ TV A ^ TV TV y"* mm 
GAAGAACGTT 


mmy^i^TV TV fJ^f^ iv T* 

1 TCC AA 1 G A i 


GAGCACTTTT 


3900 


AAAGTTCTGC 


TATGTGGCGC 


/-I y-^ fTl TV ITlfTlTV my^/*^ 

GGTATTATCC 


f^r^ <T» TV fTlfTT^ TV ^"4/^ 

CGTATTGACG 


^^^^ ^ ^ ^ TV ^ ^ 

CCGGGCAAGA 


GCAACTCGGT 


3960 


^^^^^^ Ik m n 

CGCCGCATAC 


ACTATTCTCA 


GAATGACTTG 


GTTGAGTACT 


^ TV TV ^ fn^^ TV 

CACCAGTCAC 


AGAAAAGCAT 


4020 


CTTACGGATG 


GCATGACAGT 


TV 1V ^ fV TV TV rrtrTV TV 

AAGAGAATTA 


TGCAGTGCTG 


TV n% TV TV ^T^^ TV 

CCATAACCAT 


IV nn^^^ TV m Tk ik «^ 

GAGTGATAAC 


4080 


AC TG CGG CCA 


ACTTACTTCT 


Iv TV TV ^^X^ TV -n^^^ 

GACAACGATC 


^ ^ TV ^ ^ TV TV 

GG AG G AC CG A 


TV *i ^ y 1 1 TV TV 

AGGAGCTAAC 


CGCTTTTTTG 


4140 


CACAACATGG 


^ ^ ^ TV TV m^ m 

GGGATCATGT 


TV TV ^fn^y^^^n^m 

AACTCGCCTT 


^ Tvm^^mm^^^ 
GATCGTTGGG 


TV TV ^^^^^1^^ Tl ^^flT 

AACCGGAGCT 


GAATGAAGCC 


4200 


AT AC C AAACG 


TV TV i**^ n^/^ TV 

ACGACrCG I G A 


IV /^/^ TV X <TV^ 

CACCACGATG 


/^T'J^ m TV ^ T\ TV 

CC 1 G I AGCAA 


m^^/^TV TV rf^TV IV ^ 

\1 GGCAACAAC 


GTTGCGCAAA 


4260 


CTATTAACl G 


GCG AAC 1 ACT 


TACT CT AG CT 


T CCCGGCAAC 


AATTAAI AGA 


CTGGATGGAG 


4320 


GCGGATAAAG 


TTGCAGGACC 


ACTTCTGCGC 


TCGGCCCTTC 


CGGCTGGC I G 


GTTTATTGCT 


4380 


G AT AAAT CTG 


G AGCCGG 1 G A 


^ m^ o ^ m^ m 

GCGTGGGTCT 


r^/^i^/**/^mTV m^TV 
CGCGG 1 Al C A 


mm^/^Tv ^ TV 
1 TGCAGCAC i 


^^f^^^ TV Tk m 

GGGGCCAGAT 


4440 


GGTAAGCCCT 


CCCGTATCGT 


TV f ti 11 TV m^fn TV ^ 

AGTTATCTAC 


TV /*T^ TV f^^^ Ti 

ACGACGGGGA 


GTCAGGCAAC 


TV ^1^^ ^1 TV ^n^^ Tk m 

TATGGATGAA 


jV A 

4500 


CGAAATAGAC 


IV o IV m*^r^ r^^P*^ TV 
AGATCGC TGA 


GATAGGTGCC 


TV /-<rp/^ TV mm TV 

ICACIGAITA 


Tvo ^^Tv mm/*' TV 
AG C ATTGG 1 A 


TV ^4 m^ TV ^ TV A 

ACTGTCAGAC 


4560 


CAAGTTTACT 


m TV m TV ti / ii 

CATATATACT 


nvAv TV ^ TV mm^^ ^ m 

TTAGATTGAT 


^nfn TV TV TV TV ^^^nfn^^ 

TTAAAACTTC 


ATTTTTAATT 


TV TV TV TV ^9 ^% Tk. #v% 

TAAAAGGATC 


4620 


T AGG TG AAG A 


TCCTTTTTGA 


rn Tk TV mym^Tv 

TAATCTCATG 


TV ^^TV TV TV TV m^^ 

ACCAAAATCC 


^^nfnTV TV /^^*^ *T^/^ TV 

CTTAACGTGA 


GTTTTCGTTC 


4680 


CACTGAGCGT 


CAGACCCCGT 


TV ^ TV TV TV TV ^ TV ^n^^ 

AGAAAAGATC 


TV TV TV f^^^ TV n^^t^n^n 

AAAGGATCTT 


CTTGAGATCC 


TTTTTTTCTG 


4740 


CGCGTAATCT 


GCTGCTTGCA 


TV TV TV TV TV TV TV TV TV 

AACAAAAAAA 


CCACCGC I AC 


CAGCGGTGGT 


TTGTTTG CCG 


4800 


GATCAAGAGC 


TACCAACTCT 


TTTTCCGAAG 


GTAACTGGCT 


^n^^ TV ^T TV y^ TV 

TCAGCAGAGC 


GCAGATACCA 


4860 


AATACTGTCC 


TTCTAGTGTA 


GCCGTAGTTA 


GGCCACCACT 


TCAAGAACTC 


TGTAGCACCG 


4920 


CCTACATACC 


TCGCTCTGCT 


AATCCTGTTA 


CCAGTGGCTG 


CTGCCAGTGG 


CGATAAGTCG 


4980 


TGTCTTACCG 


GGTTGGACTC 


AAGACGATAG 


TTACCGGATA 


AGGCGCAGCG 


#n 

GTCGGGCTGA 


5040 


ACGGGGGGTT 


CGTGCACACA 


GCCCAGCTTG 


GAGCGAACGA 


CCT ACACCG A 


ACTGAGATAC 


5100 


CTACAGCGTG 


AG C T ATG AG A 


AAGCGCCACG 


CTTCCCGAAG 


^k m «k TV y^ ^1 y*v 

GGAGAAAGGC 


GGACAGGTAT 


5160 


CCGGTAAGCG 


GCAGGGTCGG 


AACAGGAGAG 


CGCACGAGGG 


AGCTTCCAGG 


GGGAAACGCC 


5220 


TGGTATCTTT 


ATAGTCCTGT 


CGGGTTTCGC 


CACCTCTGAC 


TTGAGCGTCG 


ATTTTTGTGA 


5280 


TGCTCGTCAG 


GGGGGCGGAG 


CCTATGGAAA 


AACGCCAGCA 


ACGCGGCCTT 


TTTACGGTTC 


5340 


CTGGCCTTTT 


GCTGGCCTTT 


TGCTCACATG 


TTCTTTCCTG 


CGTTATCCCC 


TGATTCTGTG 


5400 


GATAACCGTA 


TTACCGCCTT 


TGAGTGAGCT 


GATACCGCTC 


GCCGCAGCCG 


AACGACCGAG 


5460 


CGCAGCGAGT 


CAGTGAGCGA 


GGAAGCGGAA 


GAGCGCCCAA 


TACGCAAACC 


GCCTCTCCCC 


5520 


GCGCGTTGGC 


CGATTCATTA 


ATGCAGCTGG 


CACGACAGGT 


TTCCCGACTG 


GAAAGCGGGC 


5580 


AGTGAGCGCA 


ACGCAATTAA 


TGTGAGTTAG 


CTCACTCATT 


AGGCACCCCA 


GGCTTTACAC 


5640 


TTTATGCTTC 


CGGCTCGTAT 


GTTGTGTGGA 


ATTGTGAGCG 


GATAACAATT 


TCACACAGGA 


5700 


AACAGCTATG 


ACCATGATTA 


CGAATTGCGG 


CCG 
SEQ ID 13

GTC AAC CGC GGA CTG CGC ATC ATG AAG TTC TTC GCC ATC

SEQ ID 14

TCT AGC AAG CTT GGC TCT AGT TCT GGA ACC GCA CCA GGC GGC AGC AAC GGC
AAC GGC AAT GTT TGC

SEQ ID 15

TCG TAG AAG CTT TCA AGC ACC GAC GGC GGT

SEQ ID 16

TCT AGC TCT AGA AGC AAC GGC AAC GGC AAT GTT

SEQ ID 17

TGC TAG TCG ACC TGC TAG CAG CAC CGA CGG CGG TCT G

SEQ ID 18: S. cerevisiae FLO1 coding sequence, 4614 bp

ATGACAATGC CTCATCGCTA TATGTTTTTG GCAGTCTTTA CACTTCTGGC ACTAACTAGT 
GTGGCCTCAG GAGCCACAGA GGCGTGCTTA CCAGCAGGCC AGAGGAAAAG TGGGATGAAT
ATAAATTTTT ACCAGTATTC ATTGAAAGAT TCCTCCACAT ATTCGAATGC AGCATATATG 
GCTTATGGAT ATGCCTCAAA AACCAAACTA GGTTCTGTCG GAGGACAAAC TGATATCTCG 
ATTGATTATA ATATTCCCTG TGTTAGTTCA TCAGGCACAT TTCCTTGTCC TCAAGAAGAT 
TCCTATGGAA ACTGGGGATG CAAAGGAATG GGTGCTTGTT CTAATAGTCA AGGAATTGCA 
TACTGGAGTA CTGATTTATT TGGTTTCTAT ACTACCCCAA CAAACGTAAC CCTAGAAATG 
ACAGGTTATT TTTTACCACC ACAGACGGGT TCTTACACAT TCAAGTTTGC TACAGTTGAC 
GACTCTGCAA TTCTATCAGT AGGTGGTGCA ACCGCGTTCA ACTGTTGTGC TCAACAGCAA 
CCGCCGATCA CATCAACGAA CTTTACCATT GACGGTATCA AGCCATGGGG TGGAAGTTTG 
CCACCTAATA TCGAAGGAAC CGTCTATATG TACGCTGGCT ACTATTATCC AATGAAGGTT 
GTTTACTCGA ACGCTGTTTC TTGGGGTACA CTTCCAATTA GTGTGACACT TCCAGATGGT 
ACCACTGTAA GTGATGACTT CGAAGGGTAC GTCTATTCCT TTGACGATGA CCTAAGTCAA 
TCTAACTGTA CTGTCCCTGA CCCTTCAAAT TATGCTGTCA GTACCACTAC AACTACAACG 
GAACCATGGA CCGGTACTTT CACTTCTACA TCTACTGAAA TGACCACCGT CACCGGTACC 
AACGGCGTTC CAACTGACGA AACCGTCATT GTCATCAGAA CTCCAACAAC TGCTAGCACC 
ATCATAACTA CAACTGAGCC ATGGAACAGC ACTTTTACCT CTACTTCTAC CGAATTGACC 
ACAGTCACTG GCACCAATGG TGTACGAACT GACGAAACCA TCATTGTAAT CAGAACACCA 
ACAACAGCCA CTACTGCCAT AACTACAACT GAGCCATGGA ACAGCACTTT TACCTCTACT 
TCTACCGAAT TGACCACAGT CACCGGTACC AATGGTTTGC CAACTGATGA GACCATCATT 



GTCATCAGAA CACCAACAAC AGCCACTACT GCCATGACTA CAACTCAGCC ATGGAACGAC
ACTTTTACCT CTACATCCAC TGAAATGACC ACCGTCACCG GTACCAACGG TTTGCCAACT
GATGAAACCA TCATTGTCAT CAGAACACCA ACAACAGCCA CTACTGCTAT GACTACAACT
CAGCCATGGA ACGACACTTT TACCTCTACA TCCACTGAAA TGACCACCGT CACCGGTACC
AACGGTTTGC CAACTGATGA AACCATCATT GTCATCAGAA CACCAACAAC AGCCACTACT
GCCATGACTA CAACTCAGCC ATGGAACGAC ACTTTTACCT CTACATCCAC TGAAATGACC
ACCGTCACCG GTACCAATGG TTTGCCAACT GATGAGACCA TCATTGTCAT CAGAACACCA
ACAACAGCCA CTACTGCCAT GACTACAACT CAGCCATGGA ACGACACTTT TACCTCTACA
TCCACTGAAA TGACCACCGT CACCGGTACC AACGGTTTGC CAACTGATGA AACCATCATT
GTCATCAGAA CACCAACAAC AGCCACTACT GCCATAACTA CAACTCAGCC ATGGAACAGC
ACTTTTACCT CTACTTCTAC CGAATTGACC ACAGTCACCG GTACCAATGG TTTGCCAACT
GATGAGACCA TCATTGTCAT CAGAACACCA ACAACAGCCA CTACTGCCAT GACTACAACT
CAGCCATGGA ACGACACTTT TACCTCTACA TCCACTGAAA TGACCACCGT CACCGGTACC
AACGGTTTGC CAACTGATGA AACCATCATT GTCATCAGAA CACCAACAAC AGCCACTACT
GCCATGACTA CAACTCAGCC ATGGAACGAC ACTTTTACCT CTACATCCAC TGAAATGACC
ACCGTCACCG GTACCAACGG TTTGCCAACT GATGAGACCA TCATTGTCAT CAGAACACCA
ACAACAGCCA CTACTGCCAT GACTACAACT CAGCCATGGA ACGACACTTT TACCTCTACA
TCCACTGAAA TGACCACCGT CACCGGTACC AACGGCGTTC CAACTGACGA AACCGTCATT
GTCATCAGAA CTCCAACTAG TGAAGGTCTA ATCAGCACCA CCACTGAACC ATGGACTGGT
ACTTTCACCT CTACATCCAC TGAGATGACC ACCGTCACCG GTACTAACGG TCAACCAACT
GACGAAACCG TGATTGTTAT CAGAACTCCA ACCAGTGAAG GTTTGGTTAC AACCACCACT
GAACCATGGA CTGGTACTTT TACTTCTACA TCTACTGAAA TGACCACCAT TACTGGAACC
AACGGCGTTC CAACTGACGA AACCGTCATT GTCATCAGAA CTCCAACCAG TGAAGGTCTA
ATCAGCACCA CCACTGAACC ATGGACTGGT ACTTTTACTT CTACATCTAC TGAAATGACC
ACCATTACTG GAACCAATGG TCAACCAACT GACGAAACCG TTATTGTTAT CAGAACTCCA
ACTAGTGAAG GTCTAATCAG CACCACCACT GAACCATGGA CTGGTACTTT CACTTCTACA
TCTACTGAAA TGACCACCGT CACCGGTACC AACGGCGTTC CAACTGACGA AACCGTCATT
GTCATCAGAA CTCCAACCAG TGAAGGTCTA ATCAGCACCA CCACTGAACC ATGGACTGGC
ACTTTCACTT CGACTTCCAC TGAGGTTACC ACCATCACTG GAACCAACGG TCAACCAACT
GACGAAACTG TGATTGTTAT CAGAACTCCA ACCAGTGAAG GTCTAATCAG CACCACCACT
GAACCATGGA CTGGTACTTT CACTTCTACA TCTGCTGAAA TGACCACCGT CACCGGTACT



AACGGTCAAC CAACTGACGA AACCGTGATT 
GTTACAACCA CCACTGAACC ATGGACTGGT 
ACTGTCACTG GAACCAATGG CTTGCCAACT 
ACTACTGCCA TCTCATCCAG TTTGTCATCA 
ACGTCTTCGC GTCCAATTAT TACCCCATTC 
TCCTCAGTAA TTTCTTCCTC AGTCACTTCT 
TCCTCAGTCA TTTCTTCTTC TACAACAACC 
TCATCCGTCA TTCCAACCAG TAGTTCCACC 
GCTGGTTCTG TCTCTTCTTC CTCTTTTATC 
TCTTCTTCAT CATTACCACT TGTTACCAGT 
TTACCACCTG CTACCACTAC AAAAACGAGC 
TGCGAGTCTC ATGTGTGCAC TGAATCCATC 
ACTGTTAGCG GCGTCACAAC AGAGTATACC 
ACAAAGCAAA CCAAAGGGAC AACAGAGCAA 
GTTACAATTT CTTCTTGTGA ATCTGACGTA 
TCTACAAGCA CTGCTACTAT TAACGGCGTT 
TCCACCACAG AATCGAGGCA ACAAACAACG 
GTGTGTTCCG AAACTGCTTC ACCTGCCATT 
GTTGTTACGG TCTATCCTAC ATGGAGGCCA 
AAAATGAACA GTGCTACCGG TGAGACAACA 
AATACTGTAG CTGCTGAGAC GATTACCAAT 
ACGTCTTCGC TTTCAAGATC TAATCACGCT 
ATTGGTCACA GCAGTAGTGT TGTTTCTGTA 
AGTTCCGGGT TGAGTACTAT GTCGCAACAG 
GGATATAGTA CAGCTTCTTT AGAAATTTCA 
GCCGGTAGTG GTTTAAGTGT CTTCATTGCG 
GTTATCAGAA CTCCAACCAG TGAAGGTTTG 
ACTTTTACTT CGACTTCCAC TGAAATGTCT 
GATGAAACTG TCATTGTTGT CAAAACTCCA 
TCATCTTCAG GACAAATCAC CAGCTCTATC 
TATCCTAGCA ATGGAACTTC TGTGATTTCT 
TCTCTATTCA CTTCTTCTCC AGTCATTTCT 
TCCACTTCTA TATTTTCTGA ATCATCTAAA 
TCTGGTTCTT CTGAGAGCGA AACGAGTTCA 
TCTTCTGAAT CATCAAAATC TCCTACATAT 
GCGACAACAA GCCAGGAAAC TGCTTCTTCA 
GAACAAACCA CTTTGGTTAC CGTGACATCC 
TCCCCTGCGA TTGTTTCCAC AGCTACTGTT 
ACATGGTGCC CTATTTCTAC TACAGAGACA 
ACCACAGAAA CAACAAAACA AACCACGGTA 
TGCTCTAAGA CTGCTTCTCC AGCCATTGTA 
ACTACAGAAT ACACAACATG GTGTCCTATT 
CTAGTTACTG TTACTTCCTG CGAATCTGGT 
GTTTCGACGG CCACGGCTAC TGTGAATGAT 
CAGACTGCGA ATGAAGAGTC TGTCAGCTCT 
ACCAATACTT TAGCTGCTGA AACGACTACC 
ACTGGAGCTG CTGAGACGAA AACAGTAGTC 
GAAACACAGA CGGCTTCCGC GACCGATGTG 
TCCGAAACTG GCAACACCAA GAGTCTAACA 
CCTCGTAGCA CACCAGCAAG CAGCATGGTA 
ACGTATGCTG GCAGTGCCAA CAGCTTACTG 
TCCTTATTGC TGGCAATTAT TTAA 



SEQ ID 19: Sequence of pYES2 

Comments for pYES2:
5857 nucleotides

GAL1 promoter: bases 1-452
T7 promoter/priming site: bases 476-495
Multiple cloning site: bases 502-601
CYC1 transcription terminator: bases 609-857
pMB1 (pUC-derived) origin: bases 1039-1712
Ampicillin resistance gene: bases 1857-2717
URA3 gene: bases 2735-3842
2 micron origin: bases 3846-5317
f1 origin: bases 5385-5840

ACGGATTAGAAGCCGCCGAGCGGGTGACAGCCCTCCGAAGGAAGACTCTCCTCCGTGCGTCCTCGTCCTC 
ACCGGTCGCGTTCCTGAAACGCAGATGTGCCTCGCGCCGCACTGCTCCGAACAATAAAGATTCTACAATA 
CTAGCTTTTATGGTTATGAAGAGGAAAAATTGGCAGTAACCTGGCCCCACAAACCTTCAAATGAACGAAT 
CAAATTAACAACCATAGGATGATAATGCGATTAGTTTTTTAGCCTTATTTCTGGGGTAATTAATCAGCGA 
AGCGATGATTTTTGATCTATTAACAGATATATAAATGCAAAAACTGCATTAACCACTTTAACTAATACTT 
TCAACATTTTCGGTTTGTATTACTTCTTATTCAAATGTAATAAAAGTATCAACAAAAAATTGTTAATATA 
CCTCTATACTTTAACGTCAAGGAGAAAAAACCCCGGATCGGACTACTAGCAGCTGTAATACGACTCACTA
TAGGGAATATTAAGCTTGGTACCGAGCTCGGATCCACTAGTAACGGCCGCCAGTGTGCTGGAATTCTGCA
GATATCCATCACACTGGCGGCCGCTCGAGCATGCATCTAGAGGGCCGCATCATGTAATTAGTTATGTCAC 
GCTTACATTCACGCCCTCCCCCCACATCCGCTCTAACCGAAAAGGAACGAGTTAGACAACCTGAAGTCTA 
GGTCCCTATTTATTTTTTTATAGTTATGTTAGTATTAAGAACGTTATTTATATTTCAAATTTTTCTTTTT 
TTTCTGTACAGACGCGTGTACGCATGTAACATTATACTGAAAACCTTGCTTGAGAAGGTTTTGGGACGCT 
CGAAGGCTTTAATTTGCGGCCCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTG 
GGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGC 
tcactcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaa
ggccagcaaaagcccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctg 
acgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggc 
gtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcc 
tttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcg 
ttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaacta 
tcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagc 
agagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaagga 
cagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccgg
caaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaagga 
tctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaaggga 
ttttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatc 
aatctaaagtatatatgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctca 
gcgatctgtctatttcgttcatccatagttgcctgactccccgtcgtgtagataactacgatacgggagc 
gcttaccatctggccccagtgctgcaatgataccgcgagacccacgctcaccggctccagatttatcagc 

AATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATTCAGTCT

attaattgttgccgggaagctagagtaagtagttcgccagttaatagtttgcgcaacgttgttggcattg 
ctacaggcatcgtggtgtcactctcgtcgtttggtatggcttcattcagctccggttcccaacgatcaag 
gcgagttacatgatcccccatgttgtgcaaaaaagcggttagctccttcggtcctccgatcgttgtcaga 
agtaagttggccgcagtgttatcactcatggttatggcagcactgcataattctcttactgtcatgccat 
ccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcgacc 
gagttgctcttgcccggcgtcaatacgggataatagtgtatcacatagcagaactttaaaagtgctcatc 
attggaaaacgttcttcggggcgaaaactctcaaggatcttaccgctgttgagatccagttcgatgtaac 
ccactcgtgcacccaactgatcttcagcatcttttactttcaccagcgtttctgggtgagcaaaaacagg 
aaggcaaaatgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcatactcttccttttt 
caatgggtaataactgatataattaaattgaagctctaatttgtgagtttagtatacatgcatttactta 
taatacagttttttagttttgctggccgcatcttctcaaatatgcttcccagcctgcttttctgtaacgt 
tcaccctctaccttagcatcccttccctttgcaaatagtcctcttccaacaataataatgtcagatcctg 
tagagaccacatcatccacggttctatactgttgacccaatgcgtctcccttgtcatctaaacccacacc 
gggtgtcataatcaaccaatcgtaaccttcatctcttccacccatgtctctttgagcaataaagccgata 
acaaaatctttgtcgctcttcgcaatgtcaacagtacccttagtatattctccagtagatagggagccct 
TGCATGACAATTCTGCTAACATCAAAAGGCCTCTAGGTTCCTTTGTTACTTCTTCTGCCGCCTGCTTCAA
ACCGCTAACAATACCTGGGCCCACCACACCGTGTGCATTCGTAATGTCTGCCCATTCTGCTATTCTGTAT
ACACCCGCAGAGTACTGCAATTTGACTGTATTACCAATGTCAGCAAATTTTCTGTCTTCGAAGAGTAAAA
AATTGTACTTGGCGGATAATGCCTTTAGCGGCTTAACTGTGCCCTCCATGGAAAAATCAGTCAAGATATC
CACATGTGTTTTTAGTAAACAAATTTTGGGACCTAATGCTTCAACTAACTCCAGTAATTCCTTGGTGGTA
CGAACATCCAATGAAGCACACAAGTTTGTTTGCTTTTCGTGCATGATATTAAATAGCTTGGCAGCAACAG
GACTAGGATGAGTAGCAGCACGTTCCTTATATGTAGCTTTCGACATGATTTATCTTCGTTTCCTGCAGGT
TTTTGTTCTGTGCAGTTGGGTTAAGAATACTGGGCAATTTCATGTTTCTTCAACACTACATATGCGTATA
TATACCAATCTAAGTCTGTGCTCCTTCCTTCGTTCTTCCTTCTGTTCGGAGATTACCGAATCAAAAAAAT
TTCAAAGAAACCGAAATCAAAAAAAAGAATAAAAAAAAAATGATGAATTGAATTGAAAAGCTAGCTTATC
GATGATAAGCTGTCAAAGATGAGAATTAATTCCACGGACTATAGACTATACTAGATACTCCGTCTACTGT
ACGATACACTTCCGCTCAGGTCCTTGTCCTTTAACGAGGCCTTACCACTCTTTTGTTACTCTATTGATCC
AGCTCAGCAAAGGCAGTGTGATCTAAGATTCTATCTTCGCGATGTAGTAAAACTAGCTAGACCGAGAAAG
AGACTAGAAATGCAAAAGGCACTTCTACAATGGCTGCCATCATTATTATCCGATGTGACGCTGCAGCTTC
TCAATGATATTCGAATACGCTTTGAGGAGATACAGCCTAATATCCGACAAACTGTTTTACAGATTTACGA
TCGTACTTGTTACCCATCATTGAATTTTGAACATCCGAACCTGGGAGTTTTCCCTGAAACAGATAGTATA
TTTGAACCTGTATAATAATATATAGTCTAGCGCTTTACGGAAGACAATGTATGTATTTCGGTTCCTGGAG
AAACTATTGCATCTATTGCATAGGTAATCTTGCACGTCGCATCCCCGGTTCATTTTCTGCGTTTCCATCT
TGCACTTCAATAGCATATCTTTGTTAACGAAGCATCTGTGCTTCATTTTGTAGAACAAAAATGCAACGCG
AGAGCGCTAATTTTTCAAACAAAGAATCTGAGCTGCATTTTTACAGAACAGAAATGCAACGCGAAAGCGC
TATTTTACCAACGAAGAATCTGTGCTTCATTTTTGTAAAACAAAAATGCAACGCGACGAGAGCGCTAATT
TTTCAAACAAAGAATCTGAGCTGCATTTTTACAGAACAGAAATGCAACGCGAGAGCGCTATTTTACCAAC
AAAGAATCTATACTTCTTTTTTGTTCTACAAAAATGCATCCCGAGAGCGCTATTTTTCTAACAAAGCATC
TTAGATTACTTTTTTTCTCCTTTGTGCGCTCTATAATGCAGTCTCTTGATAACTTTTTGCACTGTAGGTC
CGTTAAGGTTAGAAGAAGGCTACTTTGGTGTCTATTTTCTCTTCCATAAAAAAAGCCTGACTCCACTTCC
CGCGTTTACTGATTACTAGCGAAGCTGCGGGTGCATTTTTTCAAGATAAAGGCATCCCCGATTATATTCT
ATACCGATGTGGATTGCGCATACTTTGTGAACAGAAAGTGATAGCGTTGATGATTCTTCATTGGTCAGAA
AATTATGAACGGTTTCTTCTATTTTGTCTCTATATACTACGTATAGGAAATGTTTACATTTTCGTATTGT
TTTCGATTCACTCTATGAATAGTTCTTACTACAATTTTTTTGTCTAAAGAGTAATACTAGAGATAAACAT
AAAAAATGTAGAGGTCGAGTTTAGATGCAAGTTCAAGGAGCGAAAGGTGGATGGGTAGGTTATATAGGGA
TATAGCACAGAGATATATAGCAAAGAGATACTTTTGAGCAATGTTTGTGGAAGCGGTATTCGCAATGGGA
AGCTCCACCCCGGTTGATAATCAGAAAAGCCCCAAAAACAGGAAGATTGTATAAGCAAATATTTAAATTG
TAAACGTTAATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATCAGCTCATTTTTTAACGAATAGCC 
CGAAATCGGCAAAATCCCTTATAAATCAAAAGAATAGACCGAGATAGGGTTGAGTGTTGTTCCAGTTTCC 
AACAAGAGTCCACTATTAAAGAACGTGGACTCCAACGTCAAAGGGCGAAAAAGGGTCTATCAGGGCGATG 
GCCCACTACGTGAACCATCACCCTAATCAAGTTTTTTGGGGTCGAGGTGCCGTAAAGCAGTAAATCGGAA 
GGGTAAACGGATGCCCCCATTTAGAGCTTGACGGGGAAAGCCGGCGAACGTGGCGAGAAAGGAAGGGAAG 
AAAGCGAAAGGAGCGGGGGCTAGGGCGGTGGGAAGTGTAGGGGTCACGCTGGGCGTAACCACCACACCCG 
CCGCGCTTAATGGGGCGCTACAGGGCGCGTGGGGATGATCCACTAGT 
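
The feature coordinates given in the comments for pYES2 can be checked against a transcribed copy of the sequence. The following is a minimal Python sketch, not part of the original listing. It assumes the coordinates are 1-based and inclusive, as is usual for such annotations; the helper names (clean_sequence, invalid_positions, feature_length, extract_feature) are hypothetical and chosen only for illustration.

```python
# Feature coordinates as listed in the comments for pYES2.
# Assumption: coordinates are 1-based and inclusive.
PYES2_LENGTH = 5857
PYES2_FEATURES = {
    "GAL1 promoter": (1, 452),
    "T7 promoter/priming site": (476, 495),
    "Multiple cloning site": (502, 601),
    "CYC1 transcription terminator": (609, 857),
    "pMB1 (pUC-derived) origin": (1039, 1712),
    "Ampicillin resistance gene": (1857, 2717),
    "URA3 gene": (2735, 3842),
    "2 micron origin": (3846, 5317),
    "f1 origin": (5385, 5840),
}


def clean_sequence(text):
    """Join listing lines into one string: keep letters only, upper-cased."""
    return "".join(ch for ch in text if ch.isalpha()).upper()


def invalid_positions(sequence):
    """Return (1-based position, character) for every non-ACGT character."""
    return [(i + 1, ch) for i, ch in enumerate(sequence) if ch not in "ACGT"]


def feature_length(start, end):
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def extract_feature(sequence, start, end):
    """Slice a feature out of the sequence using 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("feature coordinates fall outside the sequence")
    return sequence[start - 1:end]
```

With the full listing pasted into a string, `len(clean_sequence(listing))` can be compared with the stated 5857 nucleotides, and `invalid_positions` flags transcription errors before any feature is extracted.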
Claims 

1. A process for partitioning of molecules in aqueous two-phase systems (ATPS), 
comprising the steps of 

a) constructing a fusion molecule by combining a molecule of interest to a targeting 
protein having the ability to carry said molecule of interest into one of the phases, and 
b) subjecting said fusion molecule to an ATPS separation. 

2. The process according to claim 1, wherein the targeting protein is a hydrophobic 
protein. 

3. The process according to claim 1, wherein the targeting protein is selected from a 
group consisting of amphipathic proteins and proteins which form amphipathic 
aggregates. 

4. The process according to claim 1, wherein the targeting protein is a hydrophobin-like 
protein. 

5. The process according to claim 4, wherein the hydrophobin-like protein is a class 2 
hydrophobin. 

6. The process according to claim 5, wherein the hydrophobin is a Trichoderma 
hydrophobin. 

7. The process according to claim 6, wherein the Trichoderma hydrophobin is HFBI, 
HFBII or SRHI. 

8. A process for partitioning of particles, wherein the particles contain a targeting protein 
as defined in any one of claims 2 to 7 or a part thereof on their surface, the process 
comprising the step of 

- subjecting the particles to ATPS separation. 

9. The process according to claim 8, wherein the particles are cells. 

10. The process according to claim 9, wherein the cells are yeast cells. 

11. The process according to claim 8, wherein the particles are spores. 

12. The process according to any one of claims 8 to 11, wherein the targeting protein 
is fused to a molecule which brings the targeting protein onto the surface of the particle. 

13. The process according to any one of claims 1 to 12, wherein the aqueous two-phase 
system is selected from the group consisting of PEG/salt, PEG/Dextran and PEG/starch 
systems or derivatives thereof, detergent-based aqueous two-phase systems and 
thermoseparating polymer systems. 

14. The process according to claim 13, wherein the detergent-based ATPS comprises 
a detergent which is selected from the group consisting of nonionic or zwitterionic 
detergents. 

15. The process according to claim 13, wherein the thermoseparating polymer system 
comprises a polymer which is selected from the group consisting of 
polyethylene-polypropylene copolymers. 

16. The process according to any one of claims 1-15, wherein the molecule of interest 
or the particle is separated from a suspension containing cells or cell extracts. 

17. A fusion molecule, comprising a hydrophobin-like protein as defined in any one of 
claims 4 to 7 fused to a molecule of interest. 

18. A fusion molecule according to claim 17, wherein the molecule of interest is a 
cell-bound protein or a part thereof. 

19. The fusion molecule according to claim 17, wherein the molecule of interest is an 
extracellular protein or a part thereof. 

20. The fusion molecule according to claim 19, wherein the extracellular protein is an 
extracellular protein of Trichoderma, selected from the group consisting of cellulases, 
hemicellulases and proteases. 

21. The fusion molecule according to claim 17, wherein the molecule of interest is an 
antibody molecule or a part thereof. 

22. The process according to claim 1, wherein the targeting protein is fused to the 
molecule of interest according to any one of the claims 18 to 21. 

23. A recombinant organism producing a fusion molecule according to any one of 
claims 17 to 21. 

24. The recombinant organism according to claim 23, wherein the organism has been 
genetically modified to be capable of producing a fusion molecule according to any one 
of claims 17 to 21. 

25. A recombinant DNA molecule, comprising a DNA molecule encoding a fusion 
molecule according to any one of claims 17 to 21. 

26. A process for producing a targeting protein as defined in any one of claims 4 to 7, 
or a fusion molecule according to any one of the claims 17 to 21 with recombinant 
organisms, the process comprising the steps of 

a) transforming the recombinant organism with DNA molecules enabling expression of 
such molecules, and 

b) recovering such molecules from the culture of the recombinant organism. 

27. A process for separating hydrophobin-like molecules in aqueous two-phase systems, 
the process comprising the steps of 

a) mixing solutions containing said hydrophobin-like molecule with the phase forming 
chemicals, and 

b) carrying out ATPS separation, 

wherein the aqueous two-phase system is as defined in any one of claims 13 to 15. 



(57) Abstract 

The present invention relates to isolation and purification of 
proteins in aqueous two-phase systems (ATPS). Specifically 
the invention provides processes for partitioning of molecules 
of interest in ATPS by fusing said molecules to targeting 
proteins which have the ability of carrying said molecule into 
one of the phases. 
Figure 8. Coomassie stained 10 % SDS-PAGE of the partitioning of EGIcore-HFBI fusion protein in 
two-phase separation using 5 % of the detergent C12-C18EO5. Lane 1, Molecular weight marker; Lane 
2, Purified CBHI (4 µg); Lane 3, Purified EGI (4 µg); Lane 4, 1/10 diluted VTT-D-98691 
cellulose-based culture filtrate; Lanes 5 and 6, 1/10 diluted bottom phase and detergent phase (top phase), 
respectively, after separation of VTT-D-98691 culture filtrate with 5 % detergent; Lane 7, Non-diluted 
bottom phase; Lane 8, Non-diluted VTT-D-98691 cellulose culture filtrate. 

Figure 9. Western analysis of the partitioning of EGIcore-HFBI fusion protein in two-phase separation 
by using different concentrations of the detergent C12-C18EO5. Fusion proteins were detected with 
anti-HFBI antibodies. Lane 1, Molecular weight marker; Lane 2, Purified EGI; Lane 3, VTT-D-98691 
cellulose culture filtrate; Lanes 4 and 5, Detergent phase (top phase) and bottom phase, respectively, 
after separation of VTT-D-98691 culture filtrate with 5 % detergent; Lane 6, Same as lane 3, except 2 
% detergent was used; Lane 7, Same as lane 4, except 2 % detergent was used; Lane 8, Purified EGI; 
Lane 9, Purified CBHI. 

Figure 10. Coomassie stained 10 % SDS-PAGE showing further purification of EGIcore-HFBI fusion 
protein from the endogenous CBHI when the top phase was re-extracted with 2 % detergent. Lane 1, 
Molecular weight marker; Lane 2, Purified CBHI (4 µg); Lane 3, Purified EGI (4 µg); Lane 4, 
Detergent phase (top phase) after first extraction; Lane 5, Detergent phase (top phase) after second 
extraction. 



Figure 11. Coomassie stained 10 % SDS-PAGE analysis of the EGI-HFBI protein when treated with 
thrombin. Lane 1, Molecular weight marker; Lane 2, EGI-HFBI (1 mg/ml) treated 72 h with 3 U of 
thrombin at 24°C; Lane 3, Same as lane 2, except no thrombin was added; Lane 4, EGI-HFBI (1 
mg/ml) treated 48 h with 9 U of thrombin at 36°C; Lane 5, Same as lane 4, except no thrombin was 
added; Lane 6, Same as lane 5, except no incubation at 36°C. 

Figure 13. Western analysis of the partitioning of dCBD-HFBI fusion protein in two-phase separation 
using 5 % of the detergent C12-C18EO5. Fusion protein was detected with anti-HFBI antibody. Lane 
1, Four times concentrated culture filtrate; Lane 2, Four times concentrated bottom phase; Lane 3, Top 
phase. 

A non-functional restriction site is indicated with an asterisk. 